(57) The present invention relates to a strain of M. bovis BCG or M. microti, wherein said strain has
integrated part or all of the RD1 region responsible for enhanced immunogenicity of the tubercle bacilli,
especially the ESAT-6 and CFP-10 genes. These strains will be referred to as the M. bovis BCG::RD1 or
M. microti::RD1 strains and are useful as a new improved vaccine for preventing tuberculosis and as a
therapeutic product enhancing the stimulation of the immune system for the treatment of bladder cancer.






EP 1 350 839 A1 

Description 

[0001] Virulence-associated regions have long been sought in Mycobacterium. The present invention concerns
the identification of two genomic regions which are shown to be associated with a virulent phenotype in Mycobacteria,
and particularly in M. tuberculosis and in M. leprae. It also concerns fragments of said regions.

[0002] The two regions are known as RD1 and RD5, as disclosed in Molecular Microbiology (1999), vol. 32, pages
643 to 655 (Gordon S.V. et al.). Both of these regions, or at least one of them, are absent from the vaccine strains of
M. bovis BCG and from M. microti, strains that were used as live vaccines in the 1960's.

[0003] Other applications encompassed by the present invention relate to the use of all or part of
said regions to detect virulent strains of Mycobacteria, and particularly M. tuberculosis, in humans and animals.
RD1 and RD5 are considered as virulence markers under the present invention.

[0004] The recombinant Mycobacteria, and particularly M. bovis BCG, after modification of their genome by
introduction of all or part of the RD1 region and/or RD5 region into said genome, can be used to stimulate the immune
system of patients affected with a cancer, for example a bladder cancer.

[0005] The present invention relates to a strain of M. bovis BCG or M. microti, wherein said strain has integrated
part or all of the RD1 region responsible for enhanced immunogenicity to the tubercle bacilli, especially the genes
coding for the ESAT-6 and CFP-10 antigens. These strains will be referred to as the M. bovis BCG::RD1 or M. microti::RD1
strains and are useful as a new improved vaccine for prevention of tuberculosis infections and for treating superficial
bladder cancer.

[0006] Mycobacterium bovis BCG (bacille Calmette-Guerin) has been used since 1921 to prevent tuberculosis,
although it is of limited efficacy against adult pulmonary disease in highly endemic areas. Mycobacterium microti, another
member of the Mycobacterium tuberculosis complex, was originally described as the infective agent of a
tuberculosis-like disease in voles (Microtus agrestis) in the 1930's (Wells, A. Q. 1937. Tuberculosis in wild voles. Lancet 1221 and
Wells, A. Q. 1946. The murine type of tubercle bacillus. Medical Research Council Special Report Series 259:1-42).
Until recently, M. microti strains were thought to be pathogenic only for voles, but not for humans, and some were even
used as a live vaccine. In fact, the vole bacillus proved to be safe and effective in preventing clinical tuberculosis in a
trial involving roughly 10,000 adolescents in the UK in the 1950's (Hart, P. D'A., and I. Sutherland. 1977. BCG and
vole bacillus vaccines in the prevention of tuberculosis in adolescence and early adult life. British Medical Journal 2:
293-295). At about the same time, another strain, OV166, was successfully administered to half a million newborns in
Prague, former Czechoslovakia, without any serious complications (Sula, L., and I. Radkovsky. 1976. Protective effects
of M. microti vaccine against tuberculosis. J. Hyg. Epid. Microbiol. Immunol. 20:1-6). M. microti vaccination has since
been discontinued because it was no more effective than the frequently employed BCG vaccine. As a result, improved
vaccines are needed for preventing and treating tuberculosis.

[0007] The problem in attempting to improve this live vaccine is that the molecular mechanisms of both the
attenuation and the immunogenicity of BCG are still poorly understood. Comparative genomic studies of all six members of
the M. tuberculosis complex have identified more than 140 genes, whose presence is facultative, that may confer
differences in phenotype, host range and virulence. Relative to the genome of the paradigm strain, M. tuberculosis
H37Rv (S. T. Cole, et al., Nature 393, 537 (1998)), many of these genes occur in chromosomal regions that have been
deleted from certain species (RD1-16, RvD1-5) (M. A. Behr, et al., Science 284, 1520 (1999); R. Brosch, et al., Infection
Immun. 66, 2221 (1998); S. V. Gordon, et al., Molec Microbiol 32, 643 (1999); H. Salamon, et al., Genome Res 10,
2044 (2000); G. G. Mahairas, et al., J. Bacteriol. 178, 1274 (1996) and R. Brosch, et al., Proc Natl Acad Sci USA 99,
3684 (2002)).

[0008] In connection with the invention and based on their distribution among tubercle bacilli and potential to encode
virulence functions, RD1, RD3-5, RD7 and RD9 (Fig. 1A, B) were accorded highest priority for functional genomic
analysis using "knock-ins" of M. bovis BCG to assess their potential contribution to the attenuation process. Clones
spanning these RD regions were selected from an ordered M. tuberculosis H37Rv library of integrating shuttle cosmids
(S. T. Cole, et al., Nature 393, 537 (1998) and W. R. Bange, et al., Tuber. Lung Dis. 79, 171 (1999)), and individually
electroporated into BCG Pasteur, where they inserted stably into the attB site (M. H. Lee, et al., Proc. Natl. Acad. Sci.
USA 88, 3111 (1991)).

[0009] We have found that only reintroduction of RD1 led to profound phenotypic alteration. Strikingly, the BCG::RD1
"knock-in" grew more vigorously than BCG controls in immuno-deficient mice, inducing extensive splenomegaly
and granuloma formation.

[0010] The RD1 deletion is restricted to the avirulent strains M. bovis BCG and M. microti. Although the endpoints are
not identical, the deletions have removed from both vaccine strains a cluster of six genes (Rv3871-Rv3876) that are part of the
ESAT-6 locus (Fig. 1A) (S. T. Cole, et al., Nature 393, 537 (1998) and F. Tekaia, et al., Tubercle Lung Disease 79, 329
(1999)).

[0011] Among the missing products are members of the mycobacterial PE (Rv3872), PPE (Rv3873), and ESAT-6
(Rv3874, Rv3875) protein families. Despite lacking obvious secretion signals, ESAT-6 (Rv3875) and the related protein
CFP-10 (Rv3874) are abundant components of short-term culture filtrate, acting as immunodominant T-cell antigens
that induce potent Th1 responses (F. Tekaia, et al., Tubercle Lung Disease 79, 329 (1999); A. L. Sorensen, et al., Infect.
Immun. 63, 1710 (1995) and R. Colangeli, et al., Infect. Immun. 68, 990 (2000)).

[0012] In summary, we have discovered that the restoration of RD1 to M. bovis BCG leads to increased persistence
in immunocompetent mice. The M. bovis BCG::RD1 strain induces RD1-specific immune responses of the Th1-type,
has enhanced immunogenicity and confers better protection than M. bovis BCG alone in the mouse model of
tuberculosis. The M. bovis BCG::RD1 vaccine is significantly more virulent than M. bovis BCG in immunodeficient mice but
considerably less virulent than M. tuberculosis.

[0013] In addition, we show that M. microti lacks a different but overlapping part of the RD1 region (RD1mic) compared
with M. bovis BCG, and our results indicate that reintroduction of RD1 confers increased virulence on BCG::RD1 in
immunodeficient mice. The rare strains of M. microti that are associated with human disease contain a region referred to as
RD5mic whereas those from voles do not.

[0014] M. bovis BCG vaccine could be improved by reintroducing other genes encoding ESAT-6 family members
that have been lost, notably those found in the RD8 and RD5 loci of M. tuberculosis. These regions also code for
additional T-cell antigens.

[0015] M. bovis BCG::RD1 could be improved by reintroducing the RD8 and RD5 loci of M. tuberculosis.

[0016] M. bovis BCG vaccine could be improved by overexpressing the genes contained in the RD1, RD5 and RD8
regions.

[0017] Accordingly, these new strains, showing greater persistence and enhanced immunogenicity, represent an
improved vaccine for preventing tuberculosis and treating bladder cancer.

[0018] In addition, the greater persistence of these recombinant strains is an advantage for the presentation of other
antigens, for instance from HIV in humans, in order to induce protective immune responses. Those improved strains
may also be of use in veterinary medicine, for instance in preventing bovine tuberculosis.

Description

[0019] Therefore, the present invention is aimed at a strain of M. bovis BCG or M. microti, wherein said strain has
integrated all or part of the RD1 region responsible for enhanced immunogenicity to the tubercle bacilli. These strains
will be referred to as the M. bovis BCG::RD1 or M. microti::RD1 strains.

[0020] In connection with the invention, "part or all of the RD1 region" means that the strain has integrated a portion
of DNA originating from Mycobacterium tuberculosis, which comprises at least one gene selected from Rv3871, Rv3872
(mycobacterial PE), Rv3873 (PPE), Rv3874 (CFP-10), Rv3875 (ESAT-6), and Rv3876. The term "gene" refers
herein to the coding sequence in frame with its natural promoter as well as to the coding sequence which has been
isolated and placed in frame with an exogenous promoter, for example a promoter capable of directing a high level of
expression of said coding sequence.

[0021] In a specific aspect, the invention relates to a strain of M. bovis BCG or M. microti wherein said strain has
integrated at least one gene selected from Rv3871, Rv3872 (SEQ ID No 2, mycobacterial PE), Rv3873 (SEQ ID No
3, PPE), Rv3874 (SEQ ID No 4, CFP-10), Rv3875 (SEQ ID No 5, ESAT-6), and Rv3876, preferably CFP-10, ESAT-6
or both (SEQ ID No 6).

[0022] These genes can be mutated (deletion, insertion or base modification) so as to maintain the improved
immunogenicity while decreasing the virulence of the strains. Using routine procedures, the man skilled in the art can select
the M. bovis BCG::RD1 or M. microti::RD1 strains, in which a mutated gene has been integrated, showing improved
immunogenicity and lower virulence.

We have shown here that introduction of the RD1 region makes the vaccine strains induce a more effective immune
response against a challenge with M. tuberculosis.

However, this first generation of constructs can be followed by other, more fine-tuned generations of constructs, as the
complemented BCG::RD1 vaccine strain also showed a more virulent phenotype in severely immuno-compromised
(SCID) mice. Therefore, the BCG RD1+ constructs may be modified so as to be applicable as vaccine strains while
being safe for immuno-compromised individuals.

[0023] In this perspective, the man skilled in the art can adapt the BCG::RD1 strain by designing BCG vaccine
strains that carry only parts of the genes coding for ESAT-6 or CFP-10 in a mycobacterial expression vector (for example
pSM81) under the control of a promoter, more particularly an hsp60 promoter. For example, at least one portion of the
esat-6 gene that codes for immunogenic 20-mer peptides of ESAT-6 active as T-cell epitopes (Mustafa AS, Oftung F,
Amoudy HA, Madi NM, Abal AT, Shaban F, Rosenkrands I. & Andersen P. (2000) Multiple epitopes from the
Mycobacterium tuberculosis ESAT-6 antigen are recognized by antigen-specific human T cell lines. Clin Infect Dis. 30 Suppl
3:S201-5; peptides P1 to P8 are incorporated herein in the description) could be cloned into this vector and
electroporated into BCG, resulting in a BCG strain that produces these epitopes.
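
By way of illustration only, the following Python sketch shows how a protein sequence could be cut into overlapping 20-mer peptides of the kind mentioned above. The 10-residue step, the function name and the placeholder sequence are assumptions made for this example; the peptide boundaries actually used in the cited study are not given in this description and may differ.

```python
from typing import List, Tuple


def overlapping_peptides(protein: str, length: int = 20, step: int = 10) -> List[Tuple[int, int, str]]:
    """Return (start, end, peptide) windows using 1-based inclusive coordinates.

    `length` is the peptide size (20-mers here); `step` is a hypothetical offset
    between consecutive peptides and is not taken from the description.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(protein) < length:
        return []
    starts = list(range(0, len(protein) - length + 1, step))
    # Add a final full-length window so that the C-terminus is always covered.
    last_start = len(protein) - length
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, s + length, protein[s:s + length]) for s in starts]


# Hypothetical 95-residue placeholder; this is NOT the real ESAT-6 sequence.
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 5)[:95]
peptides = overlapping_peptides(placeholder)

for start, end, peptide in peptides:
    print(f"{start:>3}-{end:<3} {peptide}")
```

Each window could then be back-translated or mapped onto the coding sequence to decide which portion of the gene to clone; that step is outside the scope of this sketch.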

[0024] Alternatively, the ESAT-6 and CFP-10 encoding genes (for example on plasmid RD1-AP34 and/or RD1-2F9)
could be altered by directed mutagenesis (using for example the QuikChange Site-Directed Mutagenesis Kit from
Stratagene) in a way that most of the immunogenic peptides of ESAT-6 remain intact, but the biological functionality of
ESAT-6 is lost.

This approach could result in a more protective BCG vaccine without increasing the virulence of the recombinant BCG
construct.
[0025] Therefore, the invention is also aimed at a method for preparing and selecting M. bovis BCG or M. microti
strains comprising a step consisting of modifying the M. bovis BCG::RD1 or M. microti::RD1 strains as defined above
by insertion, deletion or mutation in the integrated RD1 region, more particularly in the esat-6 or CFP-10 gene, said
method leading to strains that are less virulent for immuno-depressed individuals. Together, these methods would make it
possible to determine whether the effect that we see with our BCG::RD1 strain is caused by the presence of additional
T-cell epitopes from ESAT-6 and CFP-10 resulting in increased immunogenicity, by better fitness of the
recombinant BCG::RD1 clones resulting in longer exposure of the immune system to the vaccine, or by a
combined effect of both factors.

[0026] In a preferred embodiment, the invention is aimed at the M. bovis BCG::RD1 strains, which have integrated
a cosmid herein referred to as RD1-2F9 or RD1-AP34, contained in the E. coli strains deposited on April 2, 2002
at the CNCM (Institut Pasteur, 25, rue du Docteur Roux, 75724 Paris cedex 15, France) under the accession numbers
I-2831 and I-2832 respectively. RD1-2F9 is a cosmid comprising a portion of the Mycobacterium tuberculosis
H37Rv genome that spans the RD1 region and the hygromycin resistance gene. RD1-AP34 is a cosmid comprising
a portion of the Mycobacterium tuberculosis DNA containing two genes coding for ESAT-6 and CFP-10 as well as a
gene conferring resistance to kanamycin.

[0027] The construct RD1-AP34 contains a 3909 bp fragment of the M. tuberculosis H37Rv genome from region
4350459 bp to 4354367 bp cloned into an integrating vector pKINT (SEQ ID No 1). The Accession No. of the segment
160 of the M. tuberculosis H37Rv genome that contains this region is AL022120.



SEQ ID No 1 : 

1 - gaattcccat ccagtgagtt caaggtcaag cggcgccccc ctggccaggc atttctcgtc 
61 - tcgccagacg gcaaagaggt catccaggcc ccctacatcg agcctccaga agaagtgttc 
121 - gcagcacccc caagcgccgg ttaagattat ttcattgccg gtgtagcagg acccgagctc 
181 - agcccggtaa tcgagttcgg gcaatgctga ccatcgggtt tgtttccggc tataaccgaa 
241 - cggtttgtgt acgggataca aatacaggga gggaagaagt aggcaaa/gg aaaaaatgic 
301 - acatgaiccg atcgctgccg acatiggcac gcaagtgagc gacaacgctc tgcacggcgt 
361 - gacggccggc icgacggcgc tgacgtcggt gaccgggctg gttcccgcgg gggccgatga 
421 - ggtctccgcc caagcggcga cggcgttcac alcggagggc alccaatlgc tggctlccaa 
481 - tgcatcggcc caagaccagc (ccaccglgc gggcgaagcg gtccaggacg tcgcccgcac 
541 - ctattcgcaa atcgacgacg gcgccgccgg cgtcitcgcc gaataggccc ccaacacatc 
601 - ggagggagtg atr.acc afgc tgtggcacgc aatgccaccg gagctaaata ccg cacggct 
661 - gatggccggc gcgggtccgg ctccaatgct tgcggcggcc gcgggatggc agacgctttc

721 - ggcggctctg gacgctcagg ccgtcgagtt gaccgcgcgc ctgaactctc tgggagaagc 

781 - ctggactgga ggtggcagcg acaaggcgct tgcggctgca acgccgatgg tggtctggct 

841 - acaaaccgcg toaacacagg ccaagacccg tgcgatgcag gcgacpgcgc aagccgcggc 

901 - atacacccag gccatggcca cgacgccgtc gctgccggag atcgccgcca accacatcac 

961 - ccaggccgtc cttacggcca ccaacttctt cggtatcaac acgatcccga tcgcgttgac

1021 - cgagatggat tatttcatcc gtatgtggaa ccaggcagcc ctggcaatgg aggtctacca 

1081 - ggccgagacc gcggttaaca cgcttttcga gaaectcga g ccgatg gcgt cgatccttga 

1141 - tcccggcgcg agccagagca cgacgaaccc gatcttcgga atgccctccc ctggcagctc

1201 - aacaccggtt ggccagttgc cgccggcggc tacccagacc ctcggccaac tgggtgagat 

1261 - gaecggcccg atgcagcagc tgacccagcc gctgcag cag gtgacgtcgt tgttcagcca 

1321 - ggtgggcggc accggcggcg gcaacccagc cgacgaggaa gccgcgcaga tgggcctgct

1381 - cggcaccagt ccectgtcga accatccgct ggctggtgga tcaggcccca gcg cgggcgc 

1441 - gggcctgctg cgcgcggagt cgctacctgg cgcaggtggg tcgttgaccc gcacgccgct 

1501 - gatgtctcag ctgatcgaaa agccggttgc cccctcggtg atgccggcgg ctgctgccgg 

1561 - atcgtcggcgacgggtggcgccgctccggt gggtgcggga gcgatgggcc agggtgcgca 

1621 - atccggcggc tccaccaggc cgggtctggt cgcgccggca ccgctcgcgc aggagcgtga 

1681 - agaagacgac gaggacgact gggacgaaga ggacgactgg tgagctcccg taalgacaac

1741 - agacttcccg gccacccggg ccggaagact tgccaacatt ttggcgagga aggtaaagag 

1801 - agaaagtagt ccagcatggc agagatgaag accgatgccg ctaccctcgc gcaggaggca 


1861 - ggtaatttcg agcggatctc cggcgacctg aaaacccaga tcgaccaggt ggagtcgacg 
1921 - gcaggttcgt tgcagggcca gtggcgcggc gcggcgggga cggccgccca ggccgcggtg
1981 - gtgcgcttcc aagaagcagc caataagcag aagcaggaac tcgacgagat ctcgacgaal 
2041 - attcgtcagg ccggcgtcca atactcgagg gccgacgagg agcagcagca ggcgctgtcc 
2101 - tcgcaaatgg gcttctgacc cgctaatacg aaaagaaacg gagcaaaaac atzacQgagc 
2161 - agcagt£gaa tttcgcgggt atceazzccz czzcaazcec aatccazppa aatgtcacst 
2221 - ccattcaitc cctccttsac pazvzzaaec aztccctzae caazctceca gcggccteee 
2281 - ecggtogcgg Ucggasscg taccazsztg tccagcauaa ateesacycc acssctaccz 
2341 - goctzaacaa czczctzcaz aacctzzc%c ggacgatcag csaazcczzt caggcg atgg

2401 - cttczaccza azzcaacztc qctgggatgi /ceca taggg caacgccgag ttcgcgtaga 

2461 - atagcgaaac acgggatcgg gcgagttcga ccttccgtcg gtctcgccct ttctcgtgtt 

2521 - tatacgtttg agcgcactct gagaggttgt catggcggcc gactacgaca agctcttccg 

2581 - gccgcacgaa ggtatggaag ctccggacga tatggcagcg cagccgttct tcgaccccag

2641 - tgcttcgttt ccgccggcgc ccgcatcggc aaacctaccg aagcccaacg gccagactcc 

2701 - gcccccgacg tccgacgacc tgtcggagcg gttcgtgtcg gccccgccgc cgccaccccc 

2761 - acccccacct ccgcctccgc caactccgat gccgatcgcc gcaggagagc cgccctcgcc 

2821 - ggaaccggcc gcatctaaac cacccacacc ccccatgccc atcgccggac ccgaaccggc 

2881 - cccacccaaa ccacccacac cccccalgcc catcgccgga cccgaaccgg ccccacccaa 

2941 - accacccaca cctccgatgc ccatcgccgg acctgcaccc accccaaccg aatcccagtt 

3001 - ggcgcccccc agaccaccga caccacaaac gccaaccgga gcgccgcagc aaccggaatc 

3061 - accggcgccc cacgtaccct cgcacgggcc acatcaaccc cggcgcaccg caccagcacc 

3121 - gccctgggca aagatgccaa tcggcgaacc cccgcccgct ccgtccagac cgtctgcgtc 

3181 - cccggccgaa ccaccgaccc ggcctgcccc ccaacactcc cgacgtgcgc gccggggtca 

3241 - ccgctatcgc acagacaccg aacgaaacgt cgggaaggta gcaactggtc catccatcca 

3301 - ggcgcggctg cgggcagagg aagcatccgg cgcgcagctc gcccccggaa cggagccctc 

3361 - gccagcgccg ttgggccaac cgagatcgta tctggctccg cccacccgcc ccgcgccgac 

3421 - agaacctccc cccagcccct cgccgcagcg caaclccggt cggcgtgccg agcgacgcgt 

3481 - ccaccccgat ttagccgccc aacatgccgc ggcgcaacct gattcaatta cggccgcaac 


3541 - cactggcggt cgtcgccgca agcgtgcagc gccggatctc gacgcgacac agaaatcctt 
3601 - aaggccggcg gccaaggggc cgaaggtgaa gaaggtgaag ccccagaaac cgaaggccac 
3661 - gaagccgccc aaagtggtgt cgcagcgcgg ctggcgacat tgggtgcatg cgttgacgcg 
3721 - aatcaacctg ggcctgtcac ccgacgagaa gtacgagctg gacctgcacg ctcgagtccg 
3781 - ccgcaatccc cgcgggtcgt atcagatcgc cgtcgtcggt ctcaaaggtg gggctggcaa 
3841 - aaccacgctg acagcagcgt tggggtcgac gttggctcag gtgcgggccg accggatcct 
3901 - ggctctaga



pos. 0001-0006 EcoRI-restriction site
pos. 0286-0583 Rv3872 coding for a PE-Protein (SEQ ID No 2)
pos. 0616-1720 Rv3873 coding for a PPE-Protein (SEQ ID No 3)
pos. 1816-2115 Rv3874 coding for Culture Filtrate Protein 10 kD (CFP10) (SEQ ID No 4)
pos. 2151-2435 Rv3875 coding for Early Secreted Antigenic Target 6 kD (ESAT6) (SEQ ID No 5)
pos. 3903-3909 XbaI-restriction site
pos. 1816-2435 CFP-10 gene + esat-6 gene (SEQ ID No 6)
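
As an illustrative aid, the positions listed above can be held in a small table and turned into feature lengths or sub-sequences. The sketch below is in Python; the dictionary, the function names and the short dummy sequence are assumptions for this example, and the coordinates are simply those read from the listing (1-based, inclusive).

```python
# Feature coordinates as read from the annotation of SEQ ID No 1
# (1-based, inclusive). The dictionary layout is hypothetical.
FEATURES = {
    "EcoRI site": (1, 6),
    "Rv3872 (PE)": (286, 583),
    "Rv3873 (PPE)": (616, 1720),
    "Rv3874 (CFP-10)": (1816, 2115),
    "Rv3875 (ESAT-6)": (2151, 2435),
    "CFP-10 + esat-6": (1816, 2435),
}


def feature_length(start: int, end: int) -> int:
    """Length in bp of a 1-based inclusive interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a sequence using 1-based inclusive coordinates."""
    feature_length(start, end)  # validates the interval
    return sequence[start - 1:end]


LENGTHS = {name: feature_length(*span) for name, span in FEATURES.items()}

for name, length in LENGTHS.items():
    print(f"{name:<18} {length:>5} bp")

# Dummy sequence used only to show the slicing convention.
print(extract("gaattcccat", 1, 6))  # first six bases of the dummy string
```

Keeping the convention explicit (1-based, inclusive) avoids off-by-one errors when the same coordinates are later applied to the full sequence.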

[0028] The sequence of the fragment RD1-2F9 (~ 32 kb) covers the region of the M. tuberculosis genome AL123456
from ca. 4337 kb to ca. 4369 kb, and also contains the sequence described above.
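
The genome coordinates quoted for the two constructs can be cross-checked with simple interval arithmetic. The Python sketch below assumes 1-based inclusive coordinates and converts the approximate kilobase endpoints of RD1-2F9 into base pairs by multiplying by 1000; both the variable names and that conversion are assumptions made for illustration.

```python
# H37Rv coordinates of the RD1-AP34 insert, as stated in the description.
AP34_INSERT = (4350459, 4354367)

# Approximate RD1-2F9 span ("ca. 4337 kb to ca. 4369 kb"), converted to bp.
# The endpoints are approximate in the description, so this is only a rough check.
COSMID_2F9_APPROX = (4337 * 1000, 4369 * 1000)


def span_length(interval):
    """Length in bp of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print(span_length(AP34_INSERT))                  # 3909
print(contains(COSMID_2F9_APPROX, AP34_INSERT))  # True
```

The first value agrees with the stated 3909 bp fragment size, and the second is consistent with the statement that RD1-2F9 also contains the RD1-AP34 sequence.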

[0029] Such strains fulfill the aim of the invention, which is to provide an improved tuberculosis vaccine or M. bovis
BCG-based prophylactic or therapeutic agent, or a recombinant M. microti derivative for these purposes.

[0030] The above described M. bovis BCG::RD1 strains are better tuberculosis vaccines than M. bovis BCG. These
strains can also be improved by reintroducing other genes found in the RD8 and RD5 loci of M. tuberculosis. These
regions code for additional T-cell antigens. As indicated, overexpressing the genes contained in the RD1, RD5 and
RD8 regions by means of exogenous promoters is encompassed by the invention. The same applies regarding
M. microti::RD1 strains. M. microti strains could also be improved by reintroducing the RD8 locus of M. tuberculosis.

[0031] In a second embodiment, the invention is directed to a cosmid or a plasmid comprising part or all of the RD1
region originating from Mycobacterium tuberculosis, said region comprising at least one gene selected from Rv3871,
Rv3872 (mycobacterial PE), Rv3873 (PPE), Rv3874 (CFP-10), Rv3875 (ESAT-6), and Rv3876. Preferably, such a
cosmid or plasmid comprises CFP-10, ESAT-6 or both. The invention also relates to the use of these cosmids or plasmids
for transforming M. bovis BCG or M. microti strains. As indicated above, these cosmids or plasmids may comprise a
mutated gene selected from Rv3871 to Rv3876, said mutated gene being responsible for the improved immunogenicity
and decreased virulence.

[0032] In another embodiment, the invention embraces a pharmaceutical composition comprising a strain as depicted 
above and a pharmaceutically acceptable carrier. 

[0033] In addition to the strains, these pharmaceutical compositions may contain suitable
pharmaceutically-acceptable carriers comprising excipients and auxiliaries which facilitate processing of the living vaccine into preparations
which can be used pharmaceutically. Further details on techniques for formulation and administration may be found
in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing Co., Easton, Pa.).

[0034] Preferably, such composition is suitable for oral, intravenous or subcutaneous administration.

[0035] The determination of the effective dose is well within the capability of those skilled in the art. A therapeutically
effective dose refers to that amount of active ingredient, i.e. the number of strains administered, which ameliorates the
symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures
in experimental animals, e.g., ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose
lethal to 50% of the population). The dose ratio of toxic to therapeutic effects is the therapeutic index, and it can be
expressed as the ratio LD50/ED50. Pharmaceutical compositions which exhibit large therapeutic indices are preferred.
Of course, ED50 is to be modulated according to the mammal to be treated or vaccinated. In this regard, the invention
contemplates a composition suitable for human administration as well as a veterinary composition.
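
The therapeutic index defined above is a simple ratio, and a minimal Python sketch of the calculation follows. The function name and the numerical doses are hypothetical and serve only to show the arithmetic; they are not experimental values from this description.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index expressed as LD50 / ED50 (same dose units for both)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in colony forming units, chosen only for illustration.
example_ld50 = 1e9
example_ed50 = 1e6
example_index = therapeutic_index(example_ld50, example_ed50)
print(example_index)  # 1000.0
```

A larger ratio means the lethal dose lies further above the effective dose, which is why compositions with large therapeutic indices are preferred.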
[0036] The invention is also aimed at a vaccine comprising a M. bovis BCG::RD1 or M. microti::RD1 strain as depicted
above and a suitable carrier. This vaccine is especially useful for preventing tuberculosis. It can also be used for treating
bladder cancer.

[0037] The invention also concerns a product comprising a strain as depicted above and at least one protein selected
from ESAT-6 and CFP-10, or an epitope derived therefrom, for a separate, simultaneous or sequential use for treating
tuberculosis.

[0038] In still another embodiment, the invention concerns the use of a M. bovis BCG::RD1 or M. microti::RD1 strain
as depicted above for preventing or treating tuberculosis.

It also concerns the use of a M. bovis BCG::RD1 or M. microti::RD1 strain as a powerful adjuvant/immunomodulator
used in the treatment of superficial bladder cancer.

[0039] The invention also contemplates the identification at the species level of members of the M. tuberculosis
complex by means of an RD-based molecular diagnostic test. Inclusion of markers for RD1mic and RD5mic would
improve the tests and act as predictors of virulence, especially in humans. In this regard, the invention concerns a
diagnostic kit comprising DNA probes and primers specifically hybridizing to a DNA portion of the RD1 or RD5 region,
more particularly probes hybridizing under stringent conditions to a gene selected from Rv3871, Rv3872 (mycobacterial
PE), Rv3873 (PPE), Rv3874 (CFP-10), Rv3875 (ESAT-6), and Rv3876, preferably CFP-10 and ESAT-6. As used herein,
the term "stringent conditions" refers to conditions which permit hybridization between the probe sequences and the
polynucleotide sequence to be detected. Suitably stringent conditions can be defined by, for example, the
concentrations of salt or formamide in the prehybridization and hybridization solutions, or by the hybridization temperature, and
are well known in the art. In particular, stringency can be increased by reducing the concentration of salt, increasing
the concentration of formamide, or raising the hybridization temperature. The temperature range corresponding to a
particular level of stringency can be further narrowed by calculating the purine to pyrimidine ratio of the nucleic acid of
interest and adjusting the temperature accordingly. Variations on the above ranges and conditions are well known in
the art.
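
The purine to pyrimidine ratio mentioned above can be computed directly from a probe sequence. The following Python sketch is one possible way to do so; the function name and the example probe are hypothetical, and the description does not specify how the resulting ratio is translated into a temperature adjustment.

```python
def purine_pyrimidine_ratio(sequence: str) -> float:
    """Ratio of purines (A, G) to pyrimidines (C, T) in a DNA sequence."""
    seq = sequence.upper()
    purines = seq.count("A") + seq.count("G")
    pyrimidines = seq.count("C") + seq.count("T")
    if purines + pyrimidines != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    if pyrimidines == 0:
        raise ValueError("ratio undefined: no pyrimidines in sequence")
    return purines / pyrimidines


# Hypothetical 10-nt probe used only to demonstrate the calculation.
example_probe = "AAGGCCTTAG"
example_ratio = purine_pyrimidine_ratio(example_probe)
print(example_ratio)  # 1.5 (6 purines, 4 pyrimidines)
```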

[0040] Among the preferred primers, we can cite:

primer esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9),
primer esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10),
primer RD1mic flanking region F GCAGTGCAAAGGTGCAGATA (SEQ ID No 11),
primer RD1mic flanking region R GATTGAGACACTTGCCACGA (SEQ ID No 12),
primer RD5mic flanking region F GAATGCCGACGTCATATCG (SEQ ID No 16),
primer RD5mic flanking region R CGGCCACTGAGTTCGATTAT (SEQ ID No 17).
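
For orientation, basic properties of the primers listed above (length, GC fraction, reverse complement) can be derived with a few lines of Python. The sketch also reports a rough melting temperature from the Wallace rule, Tm = 2(A+T) + 4(G+C); that rule is an assumption introduced here as a common rule of thumb for short oligonucleotides and is not taken from this description. The dictionary keys and function names are likewise hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Primer sequences as listed in the description (5' to 3').
PRIMERS = {
    "esat-6F": "GTCACGTCCATTCATTCCCT",
    "esat-6R": "ATCCCAGTGACGTTGCCTT",
    "RD1mic-F": "GCAGTGCAAAGGTGCAGATA",
    "RD1mic-R": "GATTGAGACACTTGCCACGA",
    "RD5mic-F": "GAATGCCGACGTCATATCG",
    "RD5mic-R": "CGGCCACTGAGTTCGATTAT",
}


def validate(sequence: str) -> str:
    """Upper-case the sequence and reject anything other than A, C, G, T."""
    seq = sequence.upper()
    if any(base not in COMPLEMENT for base in seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return seq


def reverse_complement(sequence: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(validate(sequence)))


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = validate(sequence)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(sequence: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule, short oligos only)."""
    seq = validate(sequence)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq), reverse_complement(seq))
```

The reverse complement is included because complementary sequences of the primers are also relevant to hybridization-based detection.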

[0041] The present invention also covers the complementary nucleotide sequences of said above primers as well as
the nucleotide sequences hybridizing under stringent conditions with them and having at least 20 nucleotides and less

[0042] Diagnostic kits for the identification at the species level of members of the M. tuberculosis complex comprising
antibodies directed to mycobacterial PE, PPE, CFP-10 and ESAT-6 are also embraced by the invention. As used herein,
the term "antibody" refers to intact molecules as well as fragments thereof, such as Fab, F(ab')2, and Fv, which
are capable of binding the epitopic determinant. Probes or antibodies can be labeled with isotopes, fluorescent or
phosphorescent molecules or by any other means known in the art.

[0043] The invention is further detailed below and will be illustrated with the following figures.

Figure legends

[0044]

Figure 1: M. bovis BCG and M. microti have a chromosomal deletion, RD1, spanning the cfp10-esat6 locus.

(A) Map of the cfp10-esat6 region showing the six possible reading frames and the M. tuberculosis H37Rv
gene predictions. This map is also available at: (http://genolist.pasteur.fr/TubercuList/).

The deleted regions are shown for BCG (red) and M. microti (blue) with their respective H37Rv genome
coordinates, and the extent of the conserved ESAT-6 locus (F. Tekaia, et al., Tubercle Lung Disease 79, 329
(1999)) is indicated by the gray bar.

(B) Table showing characteristics of deleted regions selected for complementation analysis. Potential virulence
factors and their putative functions disrupted by each deletion are shown. The coordinates are for the M.
tuberculosis H37Rv genome.

(C) Clones used to complement BCG. Individual clones spanning RD1 regions (RD1-I106 and RD1-2F9) were
selected from an ordered M. tuberculosis genomic library (R.B. unpublished) in pYUB412 (S. T. Cole, et al.,
Nature 393, 537 (1998) and W. R. Bange, F. M. Collins, W. R. Jacobs, Jr., Tuber. Lung Dis. 79, 171 (1999))
and electroporated into M. bovis BCG strains, or M. microti. Hygromycin-resistant transformants were verified
using PCR specific for the corresponding genes. pAP35 was derived from RD1-2F9 by excision of an AflII
fragment. pAP34 was constructed by subcloning an EcoRI-XbaI fragment into the integrative vector pKINT.
The ends of each fragment are related to the BCG RD1 deletion (shaded box) with black lines and the H37Rv
coordinates for the other fragment ends given in kilobases.

(D) Immunoblot analysis, using an ESAT-6 monoclonal antibody, of whole cell protein extracts from log-phase
cultures of H37Rv (S. T. Cole, et al., Nature 393, 537 (1998)), BCG::pYUB412 (M. A. Behr, et al., Science 284,
1520 (1999)), BCG::RD1-I106 (R. Brosch, et al., Infection Immun. 66, 2221 (1998)), BCG::RD1-2F9 (S. V.
Gordon, et al., Molec Microbiol 32, 643 (1999)), M. bovis (H. Salamon, et al., Genome Res 10, 2044 (2000)),
Mycobacterium smegmatis (G. G. Mahairas, et al., J. Bacteriol. 178, 1274 (1996)), M. smegmatis::pYUB412,
and M. smegmatis::RD1-2F9 (R. Brosch, et al., Proc Natl Acad Sci USA 99, 3684 (2002)).

Figure 2: Complementation of BCG Pasteur with the RD1 region alters the colony morphology and leads
to accumulation of Rv3873 and ESAT-6 in the cell wall.

(A) Serial dilutions of 3 week old cultures of BCG::pYUB412, BCG::I106 or BCG::RD1-2F9 growing on
Middlebrook 7H10 agar plates. The white square shows the area of the plate magnified in the image to the right.

(B) Light microscope image at fifty fold magnification of BCG::pYUB412 and BCG::RD1-2F9 colonies. 5 µl
drops of bacterial suspensions of each strain were spotted adjacently onto 7H10 plates and imaged after 10
days growth, illuminating the colonies through the agar.

(C) Immunoblot analysis of different cell fractions of H37Rv obtained from http://www.cvmbs.colostate.edu/
microbiology/tb/ResearchMA.html using either an anti-ESAT-6 antibody or

(D) anti-Rv3873 (PPE) rabbit serum. H37Rv and BCG signify whole cell extracts from the respective bacteria
and Cyt, Mem and CW correspond to the cytosolic, membrane and cell wall fractions of M. tuberculosis H37Rv.

Figure 3: Complementation of BCG Pasteur with the RD1 region increases bacterial persistence and
pathogenicity in mice.

(A) Bacteria in the spleen and lungs of BALB/c mice following intravenous (i.v.) infection via the lateral tail
vein with 10^6 colony forming units (cfu) of M. tuberculosis H37Rv (red) or 10^7 cfu of either BCG::pYUB412
(yellow) or BCG::RD1-I106 (green).

(B) Bacterial persistence in the spleen and lungs of C57BL/6 mice following i.v. infection with 10^5 cfu of
BCG::pYUB412 (yellow), BCG::RD1-I106 (green) or BCG::RD1-2F9 (blue).

(C) Bacterial multiplication after i.v. infection with 10^6 cfu of BCG::pYUB412 (yellow) and BCG::RD1-2F9 (blue)
in severe combined immunodeficiency mice (SCID). For A, B, and C each timepoint is the mean of 3 to 4 mice
and the error bars represent standard deviations.

(D) Spleens from SCID mice three weeks after i.v. infection with 10^6 cfu of either BCG::pYUB412,
BCG::RD1-2F9 or BCG::I301 (an RD3 "knock-in", Fig. 1B). The scale is in cm.

Figure 4: Immunisation of mice with BCG::RD1 generates marked ESAT-6 specific T-cell responses and
enhanced protection to a challenge with M. tuberculosis.

(A) Proliferative response of splenocytes of C57BL/6 mice immunised subcutaneously (s.c.) with 10^ CFU of
BCG::pYUB412 (open squares) or BCG::RD1-2F9 (solid squares) to in vitro stimulation with various
concentrations of synthetic peptides from poliovirus type 1 capsid protein VP1, ESAT-6 or Ag85A (K. Huygen, et al.,
Infect Immun. 62, 363 (1994), L. Brandt, J. Immunol. 157, 3527 (1996) and C. Leclerc, et al., J. Virol. 65, 711
(1991)).



(B) Proliferation of splenocytes from BCG::RD1-2F9-immunised mice in the absence or presence of 0 µg/ml
of ESAT-6 1-20 peptide, with or without 1 µg/ml of anti-CD4 (GK1.5) or anti-CD8 (H35-17-2) monoclonal
antibody. Results are expressed as mean and standard deviation of 3H-thymidine incorporation from duplicate
wells.



(C) Concentration of IFN-γ in culture supernatants of splenocytes of C57BL/6 mice stimulated for 72 h with
peptides or PPD after s.c. or i.v. immunisation with either BCG::pYUB412 (red and yellow) or BCG::RD1-2F9
(green and blue). Mice were inoculated with either 10* (yellow and green) or 10' (red and blue) cfu. Levels of
IFN-γ were quantified using a sandwich ELISA (detection limit of 500 pg/ml) with the mAbs R4-6A2 and
biotin-conjugated XMG1.2. Results are expressed as the mean and standard deviation of duplicate culture wells.

(D) Bacterial counts in the spleen and lungs of vaccinated and unvaccinated BALB/c mice 2 months after an
i.v. challenge with M. tuberculosis H37Rv. The mice were challenged 2 months after i.v. inoculation with 0
cfu of either BCG::pYUB412 or BCG::RD1-2F9. Organ homogenates for bacterial enumeration were plated
on 7H11 medium, with or without hygromycin, to differentiate M. tuberculosis from residual BCG colonies.
Results are expressed as the mean and standard deviation of 4 to 5 mice and the levels of significance derived
using the Wilcoxon rank sum test.
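
The legend states only that significance was derived with the Wilcoxon rank sum test on groups of 4 to 5 mice. As an illustration of how such a test behaves at that sample size, the following minimal sketch computes an exact two-sided p-value by enumerating all rank assignments. The function name and the log10 cfu values are hypothetical and are not taken from the experiments described here; the sketch assumes untied data.

```python
from itertools import combinations


def rank_sum_exact(sample_a, sample_b):
    """Exact two-sided Wilcoxon rank sum test for small, untied samples.

    Returns (W, p) where W is the sum of the ranks of sample_a in the
    pooled data and p is the exact two-sided p-value.
    """
    pooled = sorted(sample_a + sample_b)
    if len(set(pooled)) != len(pooled):
        raise ValueError("ties present; this sketch handles untied data only")
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    n, total = len(sample_a), len(pooled)
    w_obs = sum(rank[v] for v in sample_a)
    centre = n * (total + 1) / 2  # expected W under the null hypothesis
    # Under the null, every choice of n ranks out of `total` is equally likely.
    sums = [sum(c) for c in combinations(range(1, total + 1), n)]
    extreme = sum(1 for s in sums if abs(s - centre) >= abs(w_obs - centre))
    return w_obs, extreme / len(sums)


# Hypothetical log10 cfu counts per organ (illustration only, not source data).
vaccinated = [5.1, 5.3, 5.0, 5.2]
control = [6.0, 6.2, 6.1, 5.9, 6.3]
w, p = rank_sum_exact(vaccinated, control)
print(f"W = {w}, exact two-sided p = {p:.4f}")
```

With groups this small the smallest attainable p-value is bounded by the number of possible rank assignments, which is worth keeping in mind when reading significance levels from such experiments.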

Figure 5: Mycobacterium microti strain OV254 BAC map (named MiXXX), overlaid on the M. tuberculosis H37Rv
(named RvXXX) and M. bovis AF2122/97 (named MbXXX) BAC maps. The scale bars indicate the position on the
M. tuberculosis genome.

Figure 6: Difference in the region 4340-4360 kb between the deletion in BCG RD1bcg (A) and in M. microti RD1mic
(C) relative to M. tuberculosis H37Rv (B).

Figure 7: Difference in the region 3121-3127 kb between M. tuberculosis H37Rv (A) and M. microti OV254 (B).
Gray boxes picture the direct repeats (DR), black ones the unique numbered spacer sequences. * spacer sequence
identical to the one of spacer 58 reported by van Embden et al. (42). Note that spacers 33-36 and 20-22 are not
shown because H37Rv lacks these spacers.

Figure 8: A) AseI PFGE profiles of various M. microti strains; Hybridization with a radiolabeled B) esat-6 probe,
C) probe of the RD1mic flanking region; D) plcA probe. 1. M. bovis AF2122/97, 2. M. canettii, 3. M. bovis BCG
Pasteur, 4. M. tuberculosis H37Rv, 5. M. microti OV254, 6. M. microti Myc 94-2272, 7. M. microti B3 type mouse,
8. M. microti B4 type mouse, 9. M. microti B1 type llama, 10. M. microti B2 type llama, 11. M. microti ATCC 35782.
M: Low range PFGE marker (NEB).

Figure 9: PCR products obtained from various M. microti strains using primers that flank the RD1mic region, for
amplifying ESAT-6 antigen, that flank the MiD2 region. 1. M. microti B1 type llama, 2. M. microti B4 type mouse,
3. M. microti B3 type mouse, 4. M. microti B2 type llama, 5. M. microti ATCC 35782, 6. M. microti OV254, 7. M.
microti Myc 94-2272, 8. M. tuberculosis H37Rv.

Example 1: preparation and assessment of M. bovis BCG::RD1 strains as a vaccine for treating or preventing
tuberculosis.

[0045] As mentioned above, we have found that complementation with RD1 was accompanied by a change in colonial
appearance as the BCG Pasteur "knock-in" strains developed a strikingly different morphotype (Fig. 2A). The RD1
complemented strains adopted a spreading, less-rugose morphology that is characteristic of M. bovis, and this was
more apparent when the colonies were inspected by light microscopy (Fig. 2B). Maps of the clones used are shown
(Fig. 1C). These changes were seen following complementation with all of the RD1 constructs (Fig. 1C) and on
complementing M. microti (data not shown). Pertinently, Calmette and Guérin (A. Calmette, La vaccination préventive
contre la tuberculose (Masson et cie., Paris, 1927)) observed a change in colony morphology during their initial
passaging of M. bovis, and our experiments now demonstrate that this change, corresponding to loss of RD1, directly
contributed to attenuating this virulent strain. The integrity of the cell wall is known to be a key virulence
determinant for M. tuberculosis (C. E. Barry, Trends Microbiol 9, 237 (2001)), and changes in both cell wall lipids
(M. S. Glickman, J. S. Cox, W. R. Jacobs, Jr., Mol Cell 5, 717 (2000)) and protein (F. X. Berthet, et al., Science
282, 759 (1998)) have been shown to alter colony morphology and diminish persistence in animal models.
[0046] To determine which genes were implicated in these morphological changes, antibodies recognising three RD1
proteins (Rv3873, CFP10 and ESAT-6) were used in immunocytological and subcellular fractionation analysis. When
the different cell fractions from M. tuberculosis were immunoblotted all three proteins were localized in the cell wall
fraction (Fig. 2C) though significant quantities of Rv3873, a PPE protein, were also detected in the membrane and
cytosolic fractions (Fig. 2D). Using immunogold staining and electron microscopy the presence of ESAT-6 in the
envelope of M. tuberculosis was confirmed but no alteration in capsular ultrastructure could be detected (data not
shown). Previously, CFP-10 and ESAT-6 have been considered as secreted proteins (F. X. Berthet et al., Microbiology
144, 3195 (1998)) but our results suggest that their biological functions are linked directly with the cell wall.
[0047] Changes in colonial morphology are often accompanied by altered bacterial virulence. Initial assessment of
the growth of different BCG::RD1 "knock-ins" in C57BL/6 or BALB/c mice following intravenous infection revealed that
complementation did not restore levels of virulence to those of the reference strain M. tuberculosis H37Rv (Fig. 3A).
In longer-term experiments, modest yet significant differences were detected in the persistence of the BCG::RD1
"knock-ins" in comparison to BCG controls. Following intravenous infection of C57BL/6 mice, only the RD1
"knock-ins" were still detectable in the lungs after 106 days (Fig. 3B). This difference in virulence between the RD1
recombinants and the BCG vector control was more pronounced in severe combined immunodeficiency (SCID) mice (Fig.
3C). The BCG::RD1 "knock-in" was markedly more virulent, as evidenced by the growth rate in lungs and spleen and
also by an increased degree of splenomegaly (Fig. 3D). Cytological examination revealed numerous bacilli, extensive
cellular infiltration and granuloma formation. These increases in virulence following complementation with the RD1
region demonstrate that the loss of this genomic locus contributed to the attenuation of BCG.

[0048] The inability to restore full virulence to BCG Pasteur was not due to instability of our constructs nor to the
strain used (data not shown). Essentially identical results were obtained on complementing BCG Russia, a strain less
passaged than BCG Pasteur and presumed, therefore, to be closer to the original ancestor (M. A. Behr, et al., Science
284, 1520 (1999)). This indicates that the attenuation of BCG was a polymutational process and loss of residual
virulence for animals was documented in the late 1920s (T. Oettinger, et al., Tuber Lung Dis 79, 243 (1999)). Using the
same experimental strategy, we also tested the effects of complementing with RD3-5, RD7 and RD9 (S. T. Cole, et al.,
Nature 393, 537 (1998); M. A. Behr, et al., Science 284, 1520 (1999); R. Brosch, et al., Infection Immun. 66, 2221
(1998) and S. V. Gordon et al., Molec Microbiol 32, 643 (1999)) encoding putative virulence factors (Fig. 1B).
Reintroduction of these regions, which are not restricted to avirulent strains, did not affect virulence in
immunocompetent mice. Although it is possible that deletion effects act synergistically, it seems more plausible that
other attenuating mechanisms are at play.

[0049] Since RD1 encodes at least two potent T-cell antigens (H. Colangelli, et al., Infect. Immun. 68, 990 (2000),
M. Harboe, et al., Infect. Immun. 66, 717 (1998) and R. L. V. Skjøt, et al., Infect. Immun. 68, 214 (2000)), we
investigated whether its restoration induced immune responses to these antigens or even improved the protective
capacity of BCG. Three weeks following either intravenous or subcutaneous inoculation with BCG::RD1 or BCG controls,
we observed similar proliferation of splenocytes to an Ag85A (an antigenic BCG protein) peptide (K. Huygen, et al.,
Infect. Immun. 62, 363 (1994)), but not against a control viral peptide (Fig. 4A). Moreover, BCG::RD1 generated
powerful CD4+ T-cell responses against the ESAT-6 peptide as shown by splenocyte proliferation (Fig. 4A, B) and strong
IFN-γ production (Fig. 4C). In contrast, the BCG::pYUB412 control did not stimulate ESAT-6 specific T-cell responses,
thus indicating that these were mediated by the RD1 locus. ESAT-6 is, therefore, highly immunogenic in mice in the
context of recombinant BCG.

[0050] When used as a subunit vaccine, ESAT-6 elicits T-cell responses and induces levels of protection weaker
than but akin to those of BCG (L. Brandt et al., Infect. Immun. 68, 791 (2000)). Challenge experiments were conducted
to determine if induction of immune responses to BCG::RD1-encoded antigens, such as ESAT-6, could improve
protection against infection with M. tuberculosis. Groups of mice inoculated with either BCG::pYUB412 or BCG::RD1 were
subsequently infected intravenously with M. tuberculosis H37Rv. These experiments showed that immunisation with
the BCG::RD1 "knock-in" inhibited the growth of M. tuberculosis within both BALB/c (Fig. 4D) and C57BL/6 mice when
compared to inoculation with BCG alone.

[0051] Although the increases in protection induced by BCG::RD1 and the BCG control are modest, they demonstrate
convincingly that genetic differences have developed between the live vaccine and the pathogen which have weakened
the protective capacity of BCG. This study therefore defines the genetic basis of a compromise that has occurred,
during the attenuation process, between loss of virulence and reduced protection (M. A. Behr, P. M. Small, Nature 389,
133 (1997)). The recombinant BCGs presented here may not be appropriate in their current form as vaccine candidates
because of uncertainty about their safety. However, the strategy of reintroducing, or even overproducing (M. A. Horwitz
et al., Proc Natl Acad Sci U S A 97, 13853 (2000)), the missing immunodominant antigens of M. tuberculosis in BCG
could be combined with an immuno-neutral attenuating mutation to create a more efficacious tuberculosis vaccine.
Example 2: BAC based comparative genomics identifies Mycobacterium microti as a natural ESAT-6 deletion
mutant.

[0052] We searched for any genetic differences between human and vole isolates that might explain their different
degree of virulence and host preference and what makes the vole isolates harmless for humans. In this regard,
comparative genomics methods were employed in connection with the present invention to identify major differences that
may exist between the M. microti reference strain OV254 and the entirely sequenced strains of M. tuberculosis H37Rv
(10) or M. bovis AF2122/97 (14). An ordered Bacterial Artificial Chromosome (BAC) library of M. microti OV254 was
constructed and individual BAC to BAC comparison of a minimal set of these clones with BAC clones from previously
constructed libraries of M. tuberculosis H37Rv and M. bovis AF2122/97 was undertaken.

Ten regions were detected in M. microti that were different to the corresponding genomic regions in M. tuberculosis
and M. bovis. To investigate if these regions were associated with the ability of M. microti strains to infect humans,
their genetic organization was studied in 8 additional M. microti strains, including those isolated recently from
patients with pulmonary tuberculosis. This analysis identified some regions that were specifically absent from all
tested M. microti strains, but present in all other members of the M. tuberculosis complex, and other regions that were
only absent from vole isolates of M. microti.

2.1 MATERIALS AND METHODS 

[0053] Bacterial strains and plasmids. M. microti OV254, which was originally isolated from voles in the UK in the
1930's, was kindly supplied by MJ Colston (45). DNA from M. microti OV216 and OV183 was included in a set of
strains used during a multicenter study (26). M. microti Myc 94-2272 was isolated in 1988 from the perfusion fluid of
a 41-year-old dialysis patient (43) and was kindly provided by L. M. Parsons. M. microti ATCC 35782 was purchased from
American Type Culture Collection (designation TMC 1608 (M.P. Prague)). M. microti B1 type llama, B2 type llama, B3
type mouse and B4 type mouse were obtained from the collection of the National Reference Center for Mycobacteria,
Forschungszentrum Borstel, Germany. M. bovis strain AF2122/97, spoligotype 9, was responsible for a herd outbreak
in Devon in the UK and has been isolated from lesions in both cattle and badgers. Typically, mycobacteria were grown
on 7H9 Middlebrook liquid medium (Difco) containing 10% oleic-acid-dextrose-catalase (Difco), 0.2% pyruvic acid and
0.05% Tween 80.

[0054] Library construction, preparation of BAC DNA and sequencing reactions. Preparation of agarose-embedded
genomic DNA from M. microti strain OV254, M. tuberculosis H37Rv and M. bovis BCG was performed as described
by Brosch et al. (5). The M. microti library was constructed by ligation of partially digested HindIII fragments
(50-125 kb) into pBeloBAC11. From the 10,000 clones that were obtained, 2,000 were picked into 96-well plates and
stored at -80°C. Plasmid preparations of recombinant clones for sequencing reactions were obtained by pooling eight
copies of 96-well plates, with each well containing an overnight culture in 250 µl 2YT medium with 12.5 µg ml^-1
chloramphenicol. After 5 min centrifugation at 3000 rpm, the bacterial pellets were resuspended in 25 µl of solution A
(25 mM Tris, pH 8.0, 50 mM glucose and 10 mM EDTA), and cells were lysed by adding 25 µl of buffer B (NaOH 0.2 M, SDS
0.2%). Then 20 µl of cold 3 M sodium acetate pH 4.8 were added and the mixture was kept on ice for 30 min. After
centrifugation at 3000 rpm for 30 min, the pooled supernatants (140 µl) were transferred to new plates. 130 µl of
isopropanol were added, and after 30 min on ice, DNA was pelleted by centrifugation at 3500 rpm for 15 min. The
supernatant was discarded and the pellet resuspended in 50 µl of a 10 µg/ml RNAse A solution (in Tris 10 mM pH 7.5 /
EDTA 10 mM) and incubated at 64°C for 15 min. After precipitation (2.5 µl of sodium acetate 3 M pH 7 and 200 µl of
absolute ethanol) pellets were rinsed with 200 µl of 70% ethanol, air dried and finally suspended in 20 µl of TE buffer.
[0055] End-sequencing reactions were performed with a Taq DyeDeoxy Terminator cycle sequencing kit (Applied
Biosystems) using a mixture of 13 µl of DNA solution, 2 µl of primer (2 µM) (SP6-BAC1, AGTTAGCTCACTCATTAGGCA
(SEQ ID No 7), or T7-BAC1, GGATGTGCTGCAAGGCGATTA (SEQ ID No 8)), 2.5 µl of Big Dye and 2.5 µl of a 5X
buffer (50 mM MgCl2, 50 mM Tris). Thermal cycling was performed on a PTC-100 amplifier (MJ Inc.) with an initial
denaturation step of 60 s at 95°C, followed by 90 cycles of 15 s at 95°C, 15 s at 56°C, 4 min at 60°C. DNA was then
precipitated with 80 µl of 76% ethanol and centrifuged at 3000 rpm for 30 min. After discarding the supernatant, DNA
was finally rinsed with 80 µl of 70% ethanol and resuspended in appropriate buffers depending on the type of automated
sequencer used (ABI 377 or ABI 3700). Sequence data were transferred to Digital workstations and edited using the
TED software from the Staden package (37). Edited sequences were compared against the M. tuberculosis H37Rv
database (http://genolist.pasteur.fr/TubercuList/), the M. bovis BLAST server
(http://www.sanger.ac.uk/Projects/M_bovis/blast_server.shtml), and in-house databases to determine the relative
positions of the M. microti OV254 BAC end-sequences.

[0056] Preparation of BAC DNA from recombinants and BAC digestion profile comparison. DNA for digestion
was prepared as previously described (4). DNA (1 µg) was digested with HindIII (Boehringer) and restriction products
separated by pulsed-field gel electrophoresis (PFGE) on a Biorad CHEF-DR III system using a 1% (w/v) agarose gel
and a pulse of 3.5 s for 17 h at 6 V cm^-1. Low-range PFGE markers (NEB) were used as size standards. Insert sizes
were estimated after ethidium bromide staining and visualization with UV light. Different comparisons were made with
overlapping clones from the M. microti OV254, M. bovis AF2122/97, and M. tuberculosis H37Rv pBeloBAC11 libraries.
[0057] PCR analysis to determine presence of genes in different M. microti strains. Reactions contained 5 µl
of 10x PCR buffer (100 mM β-mercaptoethanol, 600 mM Tris-HCl, pH 8.8, 20 mM MgCl2, 170 mM (NH4)2SO4, 20 mM
nucleotide mix dNTP), 2.5 µl of each primer at 2 µM, 10 ng of template DNA, 10% DMSO and 0.5 unit of Taq polymerase
in a final volume of 12.5 µl. Thermal cycling was performed on a PTC-100 amplifier (MJ Inc.) with an initial denaturation
step of 90 s at 95°C, followed by 35 cycles of 45 s at 95°C, 1 min at 60°C and 2 min at 72°C.

[0058] RFLP analysis. In brief, agarose plugs of genomic DNA prepared as previously described (5) were digested
with either AseI, DraI or XbaI (NEB), then electrophoresed on a 1% agarose gel, and finally transferred to Hybond-C
extra nitrocellulose membranes (Amersham). Different probes were amplified by PCR from the M. microti strain OV254
or M. tuberculosis H37Rv using primers for:
esat-6 (esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9); esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10)),
the RD1mic flanking region (4340,209F GCAGTGCAAAGGTGCAGATA (SEQ ID No 11); 4354,701R
GATTGAGACACTTGCCACGA (SEQ ID No 12)), or
plcA (plcA.int.F CAAGTTGGGTCTGGTCGAAT (SEQ ID No 13); plcA.int.R GCTACCCAAGGTCTCCTGGT (SEQ ID
No 14)). Amplification products were radio-labeled by using the Stratagene Prime-It II kit (Stratagene). Hybridizations
were performed at 65°C in a solution containing NaCl 0.8 M, EDTA pH 8, 5 mM, sodium phosphate 50 mM pH 8, 2%
SDS, 1X Denhardt's reagent and 100 µg/ml salmon sperm DNA (Genaxis). Membranes were exposed to phosphorimager
screens and images were digitized by using a STORM phospho-imager.
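
As a quick sanity check on the probe primers listed above, their length, GC content and an approximate melting temperature can be computed. The sketch below uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is an assumption introduced here for illustration and is not a method stated in this document; it gives only a rough estimate for short oligonucleotides. The function names are hypothetical; the sequences are those given in the text.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule:
    2 degrees per A/T and 4 degrees per G/C. A rule of thumb only."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as listed in paragraph [0058].
PRIMERS = {
    "esat-6F": "GTCACGTCCATTCATTCCCT",
    "esat-6R": "ATCCCAGTGACGTTGCCTT",
    "4340,209F": "GCAGTGCAAAGGTGCAGATA",
    "4354,701R": "GATTGAGACACTTGCCACGA",
    "plcA.int.F": "CAAGTTGGGTCTGGTCGAAT",
    "plcA.int.R": "GCTACCCAAGGTCTCCTGGT",
}

for name, seq in PRIMERS.items():
    print(f"{name:12s} len={len(seq):2d} GC={gc_fraction(seq):.2f} Tm~{wallace_tm(seq)} C")
```

Estimates of this kind are only a first screen; salt-corrected or nearest-neighbour models would be the usual choice for actual primer design.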

DNA sequence accession numbers. The nucleotide sequences that flank MiD1, MiD2, MiD3 as well as the junction
sequence of RD1mic have been deposited in the EMBL database. Accession numbers are AJ345005, AJ345006,
AJ315556 and AJ315557, respectively.


2.2 RESULTS 

[0059] Establishment of a complete ordered BAC library of M. microti OV254. Electroporation of pBeloBAC11
containing partial HindIII digests of M. microti OV254 DNA into Escherichia coli DH10B yielded about 10,000
recombinant clones, from which 2,000 were isolated and stored in 96-well plates. Using the complete sequence of the M.
tuberculosis H37Rv genome as a scaffold, end-sequencing of 384 randomly chosen M. microti BAC clones allowed
us to select enough clones to cover almost all of the 4.4 Mb chromosome. A few rare clones that spanned regions that
were not covered by this approach were identified by PCR screening of pools as previously described (4). This resulted
in a minimal set of 50 BACs, covering over 99.9% of the M. microti OV254 genome, whose positions relative to M.
tuberculosis H37Rv are shown in Figure 5. The insert size ranged between 50 and 125 kb, and the recombinant clones
were stable. Compared with other BAC libraries from tubercle bacilli (4, 13) the M. microti OV254 BAC library contained
clones that were generally larger than those obtained previously, which facilitated the comparative genomics approach
described below.
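
The library figures above (2,000 stored clones, inserts of 50 to 125 kb, a 4.4 Mb chromosome) can be put in context with the classical Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the insert size as a fraction of the genome and P the probability that a given locus is represented. This formula is brought in here as an illustrative assumption; it treats clones as randomly distributed, which a real library only approximates, and it is not a calculation reported in this document. The function names are hypothetical.

```python
import math


def clones_needed(p, insert_kb, genome_kb):
    """Clarke-Carbon estimate of the number of random clones needed so that
    any given locus is present with probability p."""
    f = insert_kb / genome_kb
    return math.ceil(math.log(1 - p) / math.log(1 - f))


def coverage_probability(n_clones, insert_kb, genome_kb):
    """Probability that a given locus is present in n_clones random clones."""
    f = insert_kb / genome_kb
    return 1 - (1 - f) ** n_clones


GENOME_KB = 4400  # 4.4 Mb chromosome, as stated in the text

# Clones needed for 99% representation, for the smallest and largest inserts
# reported (50 and 125 kb) and an assumed intermediate value (100 kb).
for insert_kb in (50, 100, 125):
    print(insert_kb, "kb inserts:", clones_needed(0.99, insert_kb, GENOME_KB), "clones")

# Representation probability for the 2,000 stored clones, using the most
# conservative (smallest) insert size.
print(coverage_probability(2000, 50, GENOME_KB))
```

Under these assumptions a few hundred random clones already give a high probability of representing any locus, which is consistent with a minimal tiling set of 50 BACs being selectable from 384 end-sequenced clones plus a few clones found by PCR screening.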

[0060] Identification of DNA deletions in M. microti OV254 relative to M. tuberculosis H37Rv by comparative
genomics. The minimal overlapping set of 50 BAC clones, together with the availability of three other ordered BAC
libraries from M. tuberculosis H37Rv, M. bovis BCG Pasteur 1173P2 (5, 13) and M. bovis AF2122/97 (14) allowed us
to carry out direct BAC to BAC comparison of clones spanning the same genomic regions. Size differences of
PFGE-separated HindIII restriction fragments from M. microti OV254 BACs, relative to restriction fragments from M. bovis
and/or M. tuberculosis BAC clones, identified loci that differed among the tested strains. Size variations of at least 2
kb were easily detectable and 10 deleted regions, evenly distributed around the genome, and containing more than
60 open reading frames (ORFs), were identified. These regions represent over 60 kb that are missing from M. microti
OV254 strain compared to M. tuberculosis H37Rv. First, it was found that phiRv2 (RD11), one of the two M. tuberculosis
H37Rv prophages, was present in M. microti OV254, whereas phiRv1, also referred to as RD3 (29), was absent. Second,
it was found that M. microti lacks four of the genomic regions that were also absent from M. bovis BCG. In fact, these
four regions of difference named RD7, RD8, RD9 and RD10 are absent from all members of the M. tuberculosis complex
with the exception of M. tuberculosis and M. canettii, and seem to have been lost from a common progenitor strain of
M. africanum, M. microti and M. bovis (3). As such, our results obtained with individual BAC to BAC comparisons show
that M. microti is part of this non-M. tuberculosis lineage of the tubercle bacilli, and this assumption was further
confirmed by sequencing the junction regions of RD7 - RD10 in M. microti OV254. The sequences obtained were identical to
those from M. africanum, M. bovis and M. bovis BCG strains. Apart from these four conserved regions of difference,
and phiRv1 (RD3), M. microti OV254 did not show any other RDs with identical junction regions to M. bovis BCG Pasteur,
which misses at least 17 RDs relative to M. tuberculosis H37Rv (1, 13, 35). However, five other regions missing from
the genome of M. microti OV254 relative to M. tuberculosis H37Rv were identified (RD1mic, RD5mic, MiD1, MiD2, MiD3).
Such regions are specific either for strain OV254 or for M. microti strains in general. Interestingly, two of these
regions, RD1mic and RD5mic, partially overlap RDs from M. bovis BCG.

[0061] Antigens ESAT-6 and CFP-10 are absent from M. microti. One of the most interesting findings of the BAC
to BAC comparison was a novel deletion in a genomic region close to the origin of replication (figure 5). Detailed PCR
and sequence analysis of this region in M. microti OV254 showed a segment of 14 kb to be missing (equivalent to M.
tuberculosis H37Rv from 4340,4 to 4354,5 kb) that partly overlapped RD1bcg, absent from M. bovis BCG. More precisely,
ORFs Rv3864 and Rv3876 are truncated in M. microti OV254 and ORFs Rv3865 to Rv3875 are absent (figure 6). This
observation is particularly interesting as previous comparative genomic analysis identified RD1bcg as the only RD
region that is specifically absent from all BCG sub-strains but present in all other members of the M. tuberculosis
complex (1, 4, 13, 29, 35). As shown in Figure 6, in M. microti OV254 the RD1mic deletion is responsible for the loss
of a large portion of the conserved ESAT-6 family core region (40) including the genes coding for the major T-cell
antigens ESAT-6 and CFP-10 (2, 15). The fact that previous deletion screening protocols employed primer sequences
that were designed for the right hand portion of the RD1bcg region (i.e. gene Rv3878) (6, 39) explains why the RD1mic
deletion was not detected earlier by these investigations. Figure 6 shows that RD1mic does not affect genes Rv3877,
Rv3878 and Rv3879, which are part of the RD1bcg deletion.

[0062] Deletion of phospholipase-C genes in M. microti OV254. RD5mic, the other region absent from M. microti
OV254 that partially overlapped an RD region from BCG, was revealed by comparison of BAC clone Mi18A5 with BAC
Rv143 (figure 5). PCR analysis and sequencing of the junction region revealed that RD5mic was smaller than the RD5
deletion in BCG (Tables 1 and 2 below).



TABLE 1

Description of the putative function of the deleted and truncated ORFs in M. microti OV254

Region | Start - End (kb) | Overlapping ORF | Putative function or family
RD10   | 264,5-266,5      | Rv0221-Rv0223   | echA1
RD3    | 1779,5-1788,5    | Rv1573-Rv1586   | bacteriophage proteins
RD7    | 2207,5-2220,5    | Rv1964-Rv1977   | yrbE3A-3B; mce3A-F; unknown
RD9    | 2330-2332        | Rv2072-Rv2075   | cobL; probable oxidoreductase; unknown
RD5mic | 2627,6-2633,4    | Rv2348-Rv2352   | plcA-C; member of PPE family
MiD1   | 3121,8-3126,6    | Rv2816-Rv2819   | IS6110 transposase; unknown
MiD2   | 3554,0-3555,2    | Rv3187-Rv3190   | IS6110 transposase; unknown
MiD3   | 3741,1-3755,7    | Rv3345-Rv3349   | members of the PE-PGRS and PPE families; insertion elements
RD8    | 4056,8-4062,7    | Rv3617-Rv3618   | ephA; lpqG; member of the PE-PGRS family
RD1mic | 4340,4-4354,5    | Rv3864-Rv3876   | member of the CBXX/CFQX family; member of the PE and PPE families; ESAT-6; CFP10; unknown



TABLE 2. Sequence at the junction of the deleted regions in M. microti OV254

Junction: RD1mic (SEQ ID No 15)
Position: 4340,421-4354,533
ORFs: Rv3864-Rv3876
Sequence at the junction:
CAACACGAGGTTGTAAAACCTCGACG
CAGGATCGGCGATGAAATGCCAGTCC
GCGTCGCTGAGCGCGCGCTGCGCCOl
GTCCCATTTTGTCGCTGATTTGTTTGAACA
GCGACGAACCGGTGTTGAAAATCTCOCCT
GGGTCGGGGATTCCCT
Flanking primers:
4340,209F (SEQ ID No 11) GCAGTGCAAAGGTGCAGATA
4354,701R (SEQ ID No 12) GATTGAGACACTTGCCACGA

Junction: RD5mic (SEQ ID No 18)
Position: 2627,831-2635,581
ORFs: Rv2349-Rv2355
Sequence at the junction:
CCTCGATGAACCACCTGACATGACCC
CATCCTTTCCAAGAACTGGAGTCTCC
CGACATCCCCCGCCCCTTCACTGCCC
CAGGTGTCCTGGGTCGTTCCGTTGACCGT
CGAGTCCGAACATCCGTCATTCCCGGTGG
CAGTCGCTGCGGTGAC
Flanking primers:
2627,370F (SEQ ID No 16) GAATGCCCACGTCATATCG
2633,692R (SEQ ID No 17) CGGCCACTGAGTTCGATTAT

Junction: MiD1 (SEQ ID No 21)
Position: 3121,880-3126,684
ORFs: Rv2815c-Rv2818c
Sequence at the junction:
CACCTGACATGACCCCATCCTTTCCA
AGAACTGGAGTCTCCCGACATGCCGG
GGCGCTTCAGGGACATTCATGTCCATCTT
CTGGCAGATCAGCAGATCGCTTGTTCTCAG
TGC4GGTGAGTC
Flanking primers:
3121,690F (SEQ ID No 19) CAGCCAACACCAAGTAGACG
3126,924R (SEQ ID No 20) TCTACCTGCAGTCGCTTGTG

Junction: MiD2 (SEQ ID No 24)
Position: 3554,066-3555,259
ORFs: Rv3188-Rv3189
Sequence at the junction:
GCTGCCTACTACGCTCAACGCCAGAG
ACCACCCGCCGGCTGAGGTCTCAGAT
CAGAGAGTCTCCGGACTCACCGGGGC
GGTTCATAAAGGCTTCGAGACCGGACGG
GCTGTAGGTTCCTCAACTGTGTGGCGGAT
GGTCTGAGCACTTAAC
Flanking primers:
3553,880F (SEQ ID No 22) GTCCATCGAGGATGTCGAGT
3555,385R (SEQ ID No 23) CTAGGCCATTCCGTTGTCTC

Junction: MiD3 (SEQ ID No 27)
Position: 3741,139-3755,777
ORFs: Rv3345c-Rv3349c
Sequence at the junction:
TGGCGCCGGCACCTCCGTTGCCACCC
TTGCCGCCGCTGCTGGGCGCGGTCCC
GTTCGCCCCGGCCGAACCGTTCAGGG
CCGGGTTCGCCCrC/fGCCGCTAAACACG
CCGACCAAGATCAACGAGCTACCTGCCCG
GTCAAGGTTGAAGAGCCCCCATATCAGCA
AGGGCCCGGTGTCGGCG
Flanking primers:
3740,950F (SEQ ID No 25) GGCGACGCCATTTCC
3755,988R (SEQ ID No 26) AACTGTCGGGCTTGCTCTT
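
The junction positions and flanking-primer names in Table 2 allow a rough prediction of PCR product sizes, which is how a flanking-primer assay distinguishes a deleted locus from an intact one. The sketch below makes several assumptions that are not stated in the document: that the positions are M. tuberculosis H37Rv coordinates in base pairs with the comma acting as a thousands separator, that the number in a primer name is the coordinate of the primer's 5' end, and that coordinates can be subtracted without off-by-one correction. The function names are hypothetical and the results are approximate.

```python
def parse_position(text):
    """Convert a Table 2 style position such as '4340,421' or a primer name
    such as '4340,209F' to an integer base-pair coordinate (assumed)."""
    return int(text.rstrip("FR").replace(",", ""))


def expected_amplicons(fwd_primer, rev_primer, del_start, del_end):
    """Return (intact_bp, deleted_bp): approximate product sizes for a locus
    without and with the deletion, given flanking primers."""
    intact = parse_position(rev_primer) - parse_position(fwd_primer)
    deletion = parse_position(del_end) - parse_position(del_start)
    return intact, intact - deletion


# RD1mic values as listed in Table 2.
intact_bp, deleted_bp = expected_amplicons(
    "4340,209F", "4354,701R", "4340,421", "4354,533"
)
print(f"intact locus:  ~{intact_bp} bp")
print(f"RD1mic allele: ~{deleted_bp} bp")
```

Under these assumptions the intact locus would give a product of roughly 14.5 kb, which is unlikely to amplify with a 2 min extension step, whereas the deleted allele would give a short product of a few hundred base pairs. This matches the usual logic of flanking-primer deletion assays, where a product is expected only from strains carrying the deletion.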



[0063] In fact, M. microti OV254 lacks the genes plcA, plcB, plcC and two specific PPE-protein encoding genes
(Rv2352, Rv2353). This was confirmed by the absence of a clear band on a Southern blot of AseI digested genomic
DNA from M. microti OV254 hybridized with a plcA probe. However, the genes Rv2346c and Rv2347c, members of
the esat-6 family, and Rv2348c, that are missing from M. bovis and BCG strains (3), are still present in M. microti OV254.
The presence of an IS6110 element in this segment suggests that recombination between two IS6110 elements could
have been involved in the loss of RD5mic, and this is supported by the finding that the remaining copy of IS6110 does
not show a 3 base-pair direct repeat in strain OV254 (Table 2).

[0064] Lack of MiD1 provides genomic clue for M. microti OV254 characteristic spoligotype. MiD1 encompasses
the three ORFs Rv2816, Rv2817 and Rv2818 that encode putative proteins whose functions are yet unknown, and
has occurred in the direct repeat region (DR), a polymorphic locus in the genomes of the tubercle bacilli that contains
a cluster of direct repeats of 36 bp, separated by unique spacer sequences of 36 to 41 bp (17) (figure 7). The presence
or absence of 43 unique spacer sequences that intercalate the DR sequences is the basis of spacer-oligo typing, a
powerful typing method for strains from the M. tuberculosis complex (23). M. microti isolates exhibit a characteristic
spoligotype with an unusually small DR cluster, due to the presence of only spacers 37 and 38 (43). In M. microti
OV254, the absence of spacers 1 to 36, which are present in many other M. tuberculosis complex strains, appears to
result from an IS6110 mediated deletion of 636 bp of the DR region. Amplification and PvuII restriction analysis of a
2.8 kb fragment obtained with primers located in the genes that flank the DR region (Rv2813c and Rv2819) showed
that there is only one copy of IS6110 remaining in this region (figure 7). This IS6110 element is inserted into ORF
Rv2819 at position 3,119,932 relative to the M. tuberculosis H37Rv genome. As for other IS6110 elements that result
from homologous recombination between two copies (7), no 3 base-pair direct repeat was found for this copy of IS6110
in the DR region. Concerning the absence of spacers 39-43 (figure 7), it was found that M. microti showed a slightly
different organization of this locus than M. bovis strains, which also characteristically lack spacers 39-43. In M. microti
OV254 an extra spacer of 36 bp was found that was not present in M. bovis nor in M. tuberculosis H37Rv. The sequence
of this specific spacer was identical to that of spacer 58 reported by van Embden and colleagues (42). In their study
of the DR region in many strains from the M. tuberculosis complex this spacer was only found in M. microti strain
NLA000016240 (AF189828) and in some ancestral M. tuberculosis strains (3, 42). Like MiD1, MiD2 most probably
results from an IS6110-mediated deletion of two genes (Rv3188, Rv3189) that encode putative proteins whose function
is unknown (Table 2 above and Table 3 below).
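
The spoligotype described above (43 spacers scored as present or absent, with only spacers 37 and 38 present in M. microti) can be written as a 43-character binary pattern. The sketch below also derives a 15-digit octal shorthand by grouping the first 42 positions into triplets and appending the last position as a single digit; that shorthand is a commonly used convention brought in here as an assumption, not something described in this document. The function names are hypothetical.

```python
def spoligo_binary(present_spacers, n_spacers=43):
    """Binary spoligotype pattern: '1' where the numbered spacer is present."""
    return "".join(
        "1" if i in present_spacers else "0" for i in range(1, n_spacers + 1)
    )


def spoligo_octal(binary):
    """15-digit octal shorthand for a 43-spacer pattern: 14 triplets read as
    octal digits, followed by the 43rd position as a single 0 or 1."""
    if len(binary) != 43:
        raise ValueError("expected a 43-character pattern")
    triplets = [binary[i:i + 3] for i in range(0, 42, 3)]
    return "".join(str(int(t, 2)) for t in triplets) + binary[42]


# Pattern described in the text for M. microti: only spacers 37 and 38 present.
microti = spoligo_binary({37, 38})
print(microti)
print(spoligo_octal(microti))
```

Representing the pattern this way makes comparisons between strains simple string operations; for example, the absence of spacers 39 to 43 mentioned in the text corresponds to zeros in the last five positions.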



TABLE 3. Presence of the RD and MiD regions in different M. microti strains

Host   | Voles                                   | Human
Strain | OV254  | OV183  | OV216  | ATCC 35782 | Myc 94-2272 | B3 type mouse | B4 type mouse | B1 type llama | B2 type llama
RD1mic | absent | absent | absent | absent     | absent      | absent        | absent        | absent        | absent
RD3    | absent | absent | absent | absent     | absent      | absent        | absent        | absent        | absent
RD7    | absent | absent | absent | absent     | absent      | absent        | absent        | absent        | absent
RD8    | absent | absent | absent | absent     | absent      | absent        | absent        | absent        | absent
RD9    | absent | absent | absent | absent     | absent      | absent        | absent        | absent        | absent
RD10   | absent | absent | absent | absent     | absent      | absent        | absent        | absent        | absent
MiD3   | absent | ND     | ND     | absent     | absent      | absent        | absent        | absent        | absent
MiD1   | absent | ND     | ND     | present    | partial     | partial       | partial       | present       | present
RD5mic | absent | absent | absent | present    | present     | present       | present       | present       | present
MiD2   | absent | ND     | ND     | present    | present     | present       | present       | present       | present

ND, not determined






[0065] Absence of some members of the PPE family in M. microti. MiD3 was identified by the absence of two
HindIII sites in BAC Mi4B9 that exist at positions 3749 kb and 3754 kb in the M. tuberculosis H37Rv chromosome. By
PCR and sequence analysis, it was determined that MiD3 corresponds to a 12 kb deletion that has truncated or removed
five genes orthologous to Rv3345c-Rv3349c. Rv3347c encodes a protein of 3157 amino acids that belongs to the PPE
family, and Rv3346c encodes a conserved protein that is also present in M. leprae. The function of both these putative
proteins is unknown, while Rv3348 and Rv3349 are part of an insertion element (Table 1). At present, the consequences
of the MiD3 deletion for the biology of M. microti remain entirely unknown.

[0066] Extra-DNA in M. microti OV254 relative to M. tuberculosis H37Rv. M. microti OV254 possesses the 6
regions RvD1 to RvD5 and TbD1 that are absent from the sequenced strain M. tuberculosis H37Rv, but which have
been shown to be present in other members of the M. tuberculosis complex, like M. canettii, M. africanum, M. bovis,
and M. bovis BCG (3, 7, 13). In M. tuberculosis H37Rv, four of these regions (RvD2-5) contain a copy of IS6110 which
is not flanked by a direct repeat, suggesting that recombination of two IS6110 elements was involved in the deletion
of the intervening genomic regions (7). In consequence, it seems plausible that these regions were deleted from the
M. tuberculosis H37Rv genome rather than specifically acquired by M. microti. In addition, three other small insertions
have also been found; they are due to the presence of an IS6110 element in a different location than in
M. tuberculosis H37Rv and M. bovis AF2122/97. Indeed, PvuII RFLP analysis of M. microti OV254 reveals 13 IS6110
elements (data not shown).

[0067] Genomic diversity of M. microti strains. In order to obtain a more global picture of the genetic organization
of the taxon M. microti, we evaluated the presence or absence of the variable regions found in strain OV254 in eight
other M. microti strains. These strains, which were isolated from humans and voles, have been designated as M. microti
mainly on the basis of their specific spoligotype (26, 32, 43) and can be further divided into subgroups according to
the host, such as voles, llama and humans (Table 3). As stated in the introduction, M. microti is rarely found in humans,
unlike M. tuberculosis, so the availability of 9 strains from variable sources for genetic characterization is an exceptional
resource. Among them was one strain (Myc 94-2272) from a severely immunocompromised individual (43), and four
strains were isolated from HIV-positive or HIV-negative humans with spoligotypes typical of llama and mouse isolates.
For one strain, ATCC 35782 / M.P. Prague, we could not identify with certainty the original host from which the strain
was isolated, nor whether this strain corresponds to M. microti OV166, which was received by Dr. Sula from Dr. Wells
and used thereafter for the vaccination program in Prague in the 1960s (38).

[0068] First, we wanted to know whether these nine strains, designated as M. microti on the basis of their spoligotypes,
also resembled each other by other molecular typing criteria. As RFLP of pulsed-field gel separated chromosomal DNA
probably represents the most accurate molecular typing strategy for bacterial isolates, we determined the AseI profiles
of the available M. microti strains, and found that the profiles resembled each other closely but differed significantly
from the macro-restriction patterns of M. tuberculosis, M. bovis and M. bovis BCG strains used as controls. However,
as depicted in Figure 8A, the patterns were not identical to each other and each M. microti strain showed subtle
differences, suggesting that they were not epidemiologically related. A similar observation was made with other rare
cutting restriction enzymes, like DraI or XbaI (data not shown).

[0069] Common and diverging features of M. microti strains. Two strategies were used to test for the presence
or absence of variable regions in these strains, for which we do not have ordered BAC libraries. First, PCRs using
internal and flanking primers of the variable regions were employed and amplification products of the junction regions
were sequenced. Second, probes from the internal portion of variable regions absent from M. microti OV254 were
obtained by amplification of M. tuberculosis H37Rv DNA using specific primers. Hybridization with these radio-labeled
probes was carried out on blots from PFGE separated AseI restriction digests of the M. microti strains. In addition, we
confirmed the findings obtained by these two techniques by using a focused macro-array, containing some of the genes
identified in variable regions of the tubercle bacilli to date (data not shown).

[0070] This led to the finding that the RD1mic deletion is specific for all M. microti strains tested.
Indeed, none of the M. microti DNA digests hybridized with the radio-labeled esat-6 probe (Fig. 8B), but they did
hybridize with the RD1mic flanking region probe (Fig. 8C). In addition, PCR amplification using primers flanking the
RD1mic region (Table 2) yielded fragments of the same size for M. microti strains, whereas no products were obtained
for M. tuberculosis, M. bovis and M. bovis BCG strains (Fig. 9). Furthermore, the sequence of the junction region was
found to be identical among the strains, which confirms that the genomic organization of the RD1mic locus was the same
in all tested M. microti strains (Table 3). This clearly demonstrates that M. microti lacks the conserved ESAT-6 family
core region, which in other members of the M. tuberculosis complex stretches from Rv3864 to Rv3876, and, as such,
represents a taxon of naturally occurring ESAT-6 / CFP-10 deletion mutants.

[0071] Like RD1mic, MiD3 was found to be absent from all nine M. microti strains tested and, therefore, appears to
be a specific genetic marker that is restricted to M. microti strains (Table 3). However, PCR amplification showed that
RD5mic is absent only from the vole isolates OV254, OV216 and OV183, but present in the M. microti strains isolated
from human and other origins (Table 3). This was confirmed by the presence of single bands, but of differing sizes, on
a Southern blot hybridized with a plcA probe for all M. microti tested strains except OV254 (Fig. 8D). Interestingly, the
presence or absence of RD5mic correlated with the similarity of IS6110 RFLP profiles. The profiles of the three M.
microti strains isolated from voles in the UK differed considerably from the IS6110 RFLP patterns of human isolates
(43). Taken together, these results underline the proposed involvement of IS6110 mediated deletion of the RD5 region
and further suggest that RD5 may be involved in the variable potential of M. microti strains to cause disease in humans.
Similarly, it was found that MiD1 was missing only from the vole isolates OV254, OV216 and OV183, which display
the same spoligotype (43), confirming the observations that MiD1 confers the particular spoligotype of a group of M.
microti strains isolated from voles. In contrast, PCR analysis revealed that MiD1 is only partially deleted from strains
B3 and B4, both characterized by the mouse spoligotype, and from the human isolate M. microti Myc 94-2272 (Table 3).
For strain ATCC 35782, deletion of the MiD1 region was not observed. These findings correlate with the described
spoligotypes of the different isolates, as strains that had intact or partially deleted MiD1 regions had more spacers
present than the vole isolates, which only showed spacers 37 and 38.
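
The comparison summarized in Table 3 can also be expressed as data, which makes the distinction between deletions shared by all strains and deletions restricted to the vole isolates easy to check. The following is a minimal sketch, not part of the original analysis: the dictionary is a transcription of Table 3, the strain labels are shortened, the function names are illustrative, and "ND" (not determined) entries are ignored when deciding whether a region is absent from every strain.

```python
# Table 3 transcribed as data. "ND" means not determined.
# Strain order follows the table columns (labels shortened for illustration).
STRAINS = ["OV254", "OV183", "OV216", "ATCC 35782", "Myc 94-2272",
           "B3", "B4", "B1", "B2"]

TABLE3 = {
    "RD1mic": ["absent"] * 9,
    "RD3":    ["absent"] * 9,
    "RD7":    ["absent"] * 9,
    "RD8":    ["absent"] * 9,
    "RD9":    ["absent"] * 9,
    "RD10":   ["absent"] * 9,
    "MiD3":   ["absent", "ND", "ND"] + ["absent"] * 6,
    "MiD1":   ["absent", "ND", "ND", "present",
               "partial", "partial", "partial", "present", "present"],
    "RD5mic": ["absent"] * 3 + ["present"] * 6,
    "MiD2":   ["absent", "ND", "ND"] + ["present"] * 6,
}


def common_deletions(table):
    """Regions absent from every strain for which a result was determined."""
    return [region for region, calls in table.items()
            if all(call == "absent" for call in calls if call != "ND")]


def strains_lacking(table, region):
    """Strains in which the given region was scored as absent."""
    return [strain for strain, call in zip(STRAINS, table[region])
            if call == "absent"]


if __name__ == "__main__":
    print("Common deletions:", common_deletions(TABLE3))
    print("Strains lacking RD5mic:", strains_lacking(TABLE3, "RD5mic"))
```

Under these assumptions the common deletions are RD1mic, RD3, RD7 to RD10 and MiD3, while RD5mic is scored as absent only in the three vole isolates OV254, OV183 and OV216.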

2.3 COMMENTS AND DISCUSSION 

[0072] We have searched for major genomic variations, due to insertion-deletion events, between the vole pathogen,
M. microti, and the human pathogen, M. tuberculosis. BAC based comparative genomics led to the identification of 10
regions absent from the genome of the vole bacillus M. microti OV254 and several insertions due to IS6110. Seven of
these deletion regions were also absent from eight other M. microti strains, isolated from voles or humans, and they
account for more than 60 kb of genomic DNA. Of these regions, RD1mic is of particular interest, because absence of
part of this region has been found to be restricted to the BCG vaccine strains to date. As M. microti was originally
described as non pathogenic for humans, it is proposed here that RD1 genes are involved in the pathogenicity for humans.
This is reinforced by the fact that RD1bcg (29) has lost putative ORFs belonging to the esat-6 gene cluster including
the genes encoding ESAT-6 and CFP-10 (Fig. 6) (40). Both polypeptides have been shown to act as potent stimulators
of the immune system and are antigens recognized during the early stages of infection (8, 12, 20, 34). Moreover, the
biological importance of this RD1 region for mycobacteria is underlined by the fact that it is also conserved in M. leprae,
where genes ML0047-ML0056 show high similarities in their sequence and operon organization to the genes in the
esat-6 core region of the tubercle bacilli (11). In spite of the radical gene decay observed in M. leprae, the esat-6 operon
apparently has kept its functionality in this organism.

[0073] However, the RD1 deletion may not be the only reason why the vole bacillus is attenuated for humans. Indeed,
it remains unclear why certain M. microti strains included in the present study, which show exactly the same RD1mic
deletion as vole isolates, have been found as causative agents of human tuberculosis. As human M. microti cases are
extremely rare, the most plausible explanation for this phenomenon would be that the infected people were particularly
susceptible to mycobacterial infections in general. This could have been due to an immunodeficiency (32, 43) or to a
rare genetic host predisposition such as interferon gamma- or IL-12 receptor modification (22).
[0074] In addition, the finding that human M. microti isolates differed from vole isolates by the presence of region
RD5mic may also have an impact on the increased potential of human M. microti isolates to cause disease. Intriguingly,
BCG and the vole bacillus lack overlapping portions of this chromosomal region that encompasses three (plcA, plcB,
plcC) of the four genes encoding phospholipase C (PLC) in M. tuberculosis. PLC has been recognized as an important
virulence factor in numerous bacteria, including Clostridium perfringens, Listeria monocytogenes and Pseudomonas
aeruginosa, where it plays a role in cell to cell spread of bacteria, intracellular survival, and cytolysis (36, 41). To date,
the exact role of PLC for the tubercle bacilli remains unclear. plcA encodes the antigen mtp40, which has previously
been shown to be absent from seven tested vole and hyrax isolates (28). Phospholipase C activity in M. tuberculosis,
M. microti and M. bovis, but not in M. bovis BCG, has been reported (21, 47). However, PLC and sphingomyelinase
activities have been found associated with the most virulent mycobacterial species (21). The levels of phospholipase
C activity detected in M. bovis were much lower than those seen in M. tuberculosis, consistent with the loss of plcABC.
It is likely that plcD is responsible for the residual phospholipase C activity in strains lacking RD5, such as M. bovis
and M. microti OV254. Indeed, the plcD gene is located in region RvD2, which is present in some but not all tubercle
bacilli (13, 18). Phospholipase encoding genes have been recognized as hotspots for integration of IS6110 and it
appears that the regions RD5 and RvD2 undergo independent deletion processes more frequently than any other
genomic regions (44). Thus, the virulence of some M. microti strains may be due to a combination of functional
phospholipase C encoding genes (7, 25, 26, 29).

[0075] Another intriguing detail revealed by this study is that, among the deleted genes, seven code for members of
the PPE family of Gly-, Ala-, Asn-rich proteins. A closer look at the sequences of these genes showed that in some
cases they were small proteins with unique sequences, like for example Rv3873, located in the RD1mic region, or
Rv2352c and Rv2353c, located in the RD5mic region. Others, like Rv3347c, located in the MiD3 region, code for a much
larger PPE protein (3157 aa). In this case a neighboring gene (Rv3345c), belonging to another multigene family, the
PE-PGRS family, was partly affected by the MiD3 deletion. While the function of the PE/PPE proteins is currently
unknown, their predicted abundance in the proteome of M. tuberculosis suggests that they may play an important role
in the life cycle of the tubercle bacilli. Indeed, recently some of them were shown to be involved in the pathogenicity
of M. tuberculosis strains (9). Complementation of such genomic regions in M. microti OV254 should enable us to carry
out proteomics and virulence studies in animals in order to understand the role of such ORFs in pathogenesis.
[0076] In conclusion, this study has shown that M. microti, a taxon originally named after its major host Microtus
agrestis, the common vole, represents a relatively homogeneous group of tubercle bacilli. Although all tested strains
showed unique PFGE macro-restriction patterns that differed slightly among each other, deletions that were common
to all M. microti isolates (RD7-RD10, MiD3, RD1mic) have been identified. The conserved nature of these deletions
suggests that these strains are derived from a common precursor that has lost these regions, and their loss may
account for some of the observed common phenotypic properties of M. microti, like the very slow growth on solid media
and the formation of tiny colonies. This finding is consistent with results from a recent study that showed that M. microti
strains carry a particular mutation in the gyrB gene (31).
[0077] Of particular interest, some of these common features (e.g. the flanking regions of RD1mic or MiD3)
could be exploited for an easy-to-perform PCR identification test, similar to the one proposed for a range of
tubercle bacilli (33). Such a test would enable unambiguous and rapid identification of M. microti isolates in order to
obtain a better estimate of the overall rate of M. microti infections in humans and other mammalian species.
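
The decision logic behind such a PCR identification test can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and its two boolean inputs are hypothetical, the use of an esat-6 internal PCR (rather than the hybridization described above) is an assumption, and the rules simply restate the reported observations, namely a product with the RD1mic flanking primers for M. microti strains and none for M. tuberculosis, M. bovis and M. bovis BCG.

```python
def classify_isolate(rd1mic_flank_product: bool, esat6_internal_product: bool) -> str:
    """Interpret two PCR results for one isolate (hypothetical helper).

    rd1mic_flank_product:   True if the RD1mic flanking primers gave a product.
    esat6_internal_product: True if esat-6 internal primers gave a product
                            (assumed to fail when esat-6 is deleted).
    """
    if rd1mic_flank_product and not esat6_internal_product:
        # Junction amplified and esat-6 missing: the pattern reported for M. microti.
        return "consistent with M. microti (RD1mic deletion)"
    if esat6_internal_product and not rd1mic_flank_product:
        # esat-6 present and no junction product: RD1 region is intact.
        return "RD1 present: not consistent with M. microti"
    # Both positive or both negative (e.g. a different RD1 deletion, or a failed PCR).
    return "inconclusive: repeat the PCR or sequence the junction"


if __name__ == "__main__":
    print(classify_isolate(True, False))
    print(classify_isolate(False, True))
    print(classify_isolate(False, False))
```

A both-negative result is deliberately left inconclusive, since M. bovis BCG lacks esat-6 yet gave no product with the RD1mic flanking primers.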
PROVISIONAL SEQUENCE LISTING
RD1-AP34 (a 3909 bp fragment of the M. tuberculosis H37Rv genome) 

gaattcccatccagtgagttcaaggtcaagcggcgcccccctggccaggcatttctcgtctcgccagacggcaaa 

gaggtcatccaggccccctacatcgagcctccagaagaagtgttcgcagcacccccaagcgccggttaagattat 

ttcattgccggtgtagcaggacccgagctcagcccggtaatcgagttcgggcaatgctgaccatcgggtttgttt 

ccggctataaccgaacggtttgtgtacgggatacaaatacagggagggaagaagtaggcaaatggaaaaaatgtc 

acatgatccgatcgctgccgacattggcacgcaagtgagcgacaacgctctgcacggcgtgacggccggctcgac 

ggcgctgacgtcggtgaccgggctggttcccgcgggggccgatgaggtctccgcccaagcggcgacggcgttcac 

atcggagggcatccaattgctggcttccaatgcatcggcccaagaccagctccaccgtgcgggcgaagcggtcca 

ggacgtcgcccgcacctattcgcaaatcgacgacggcgccgccggcgtcttcgccgaataggcccccaacacatc

ggagggagtgatcaccatgctgtggcacgcaatgccaccggagctaaataccgcacggctgatggccggcgcggg

tccggctccaatgcttgcggcggccgcgggatggcagacgctttcggcggctctggacgctcaggccgtcgagtt 

gaccgcgcgcctgaactctctgggagaagcctggactggaggtggcagcgacaaggcgcttgcggctgcaacgcc 

gatggtggtctggctacaaaccgcgtcaacacaggccaagacccgtgcgatgcaggcgacggcgcaagccgcggc 

atacacccaggccatggccacgacgccgtcgctgccggagatcgccgccaaccacatcacccaggccgtccttac 

ggccaccaacttcttcggtatcaacacgatcccgatcgcgttgaccgagatggattatttcatccgtatgtggaa 

ccaggcagccctggcaatggaggtctaccaggccgagaccgcggttaacacgcttttcgagaagctcgagccgat 

ggcgtcgatccttgatcccggcgcgagccagagcacgacgaacccgatcttcggaatgccctcccctggcagctc 

aacaccggttggccagttgccgccggcggctacccagaccctcggccaactgggtgagatgagcggcccgatgca 

gcagctgacccagccgctgcagcaggtgacgtcgttgttcagccaggtgggcggcaccggcggcggcaacccagc 

cgacgaggaagccgcgcagatgggcctgctcggcaccagtccgctgtcgaaccatccgctggctggtggatcagg 

ccccagcgcgggcgcgggcctgctgcgcgcggagtcgctacctggcgcaggtgggtcgttgacccgcacgccgct 

gatgtctcagctgatcgaaaagccggttgccccctcggtgatgccggcggctgctgccggatcgtcggcgacggg 

tggcgccgctccggtgggtgcgggagcgatgggccagggtgcgcaatccggcggctccaccaggccgggtctggt 

cgcgccggcaccgctcgcgcaggagcgtgaagaagacgacgaggacgactgggacgaagaggacgactggtgagc 

tcccgtaatgacaacagacttcccggccacccgggccggaagacttgccaacattttggcgaggaaggtaaagag 

agaaagtagtccagcatggcagagatgaagaccgatgccgctaccctcgcgcaggaggcaggtaatttcgagcgg 

atctccggcgacctgaaaacccagatcgaccaggtggagtcgacggcaggttcgttgcagggccagtggcgcggc 

gcggcggggacggccgcccaggccgcggtggtgcgcttccaagaagcagccaataagcagaagcaggaactcgac 

gagatctcgacgaatattcgtcaggccggcgtccaatactcgagggccgacgaggagcagcagcaggcgctgtcc 

tcgcaaatgggcttctgacccgctaatacgaaaagaaacggagcaaaaacatgacagagcagcagtggaatttcg 

cgggtatcgaggccgcggcaagcgcaatccagggaaatgtcacgtccattcattccctccttgacgaggggaagc 

agtccctgaccaagctcgcagcggcctggggcggtagcggttcggaggcgtaccagggtgtccagcaaaaatggg 

acgccacggctaccgagctgaacaacgcgctgcagaacctggcgcggacgatcagcgaagccggtcaggcaatgg 

cttcgaccgaaggcaacgtcactgggatgttcgcatagggcaacgccgagttcgcgtagaatagcgaaacacggg 

atcgggcgagttcgaccttccgtcggtctcgccctttctcgtgtttatacgtttgagcgcactctgagaggttgt 

catggcggccgactacgacaagctcttccggccgcacgaaggtatggaagctccggacgatatggcagcgcagcc 

gttcttcgaccccagtgcttcgtttccgccggcgcccgcatcggcaaacctaccgaagcccaacggccagactcc 

gcccccgacgtccgacgacctgtcggagcggttcgtgtcggccccgccgccgccacccccacccccacctccgcc 

tccgccaactccgatgccgatcgccgcaggagagccgccctcgccggaaccggccgcatctaaaccacccacacc 

ccccatgcccatcgccggacccgaaccggccccacccaaaccacccacaccccccatgcccatcgccggacccga 

accggccccacccaaaccacccacacctccgatgcccatcgccggacctgcacccaccccaaccgaatcccagtt 

ggcgccccccagaccaccgacaccacaaacgccaaccggagcgccgcagcaaccggaatcaccggcgccccacgt 

accctcgcacgggccacatcaaccccggcgcaccgcaccagcaccgccctgggcaaagatgccaatcggcgaacc 

cccgcccgctccgtccagaccgtctgcgtccccggccgaaccaccgacccggcctgccccccaacactcccgacg 

tgcgcgccggggtcaccgctatcgcacagacaccgaacgaaacgtcgggaaggtagcaactggtccatccatcca 

ggcgcggctgcgggcagaggaagcatccggcgcgcagctcgcccccggaacggagccctcgccagcgccgttggg 

ccaaccgagatcgtatctggctccgcccacccgccccgcgccgacagaacctccccccagcccctcgccgcagcg 

caactccggtcggcgtgccgagcgacgcgtccaccccgatttagccgcccaacatgccgcggcgcaacctgattc 

aattacggccgcaaccactggcggtcgtcgccgcaagcgtgcagcgccggatctcgacgcgacacagaaatcctt 

aaggccggcggccaaggggccgaaggtgaagaaggtgaagccccagaaaccgaaggccacgaagccgcccaaagt 

ggtgtcgcagcgcggctggcgacattgggtgcatgcgttgacgcgaatcaacctgggcctgtcacccgacgagaa 

gtacgagctggacctgcacgctcgagtccgccgcaatccccgcgggtcgtatcagatcgccgtcgtcggtctcaa 

aggtggggctggcaaaaccacgctgacagcagcgttggggtcgacgttggctcaggtgcgggccgaccggatcct 

ggctctaga

PE coding sequence (SEQ ID No 2)

atggaaaaaatgtcacatgatccgatcgctgccgacattggcacgcaagtgagcgacaacgctctgcacggcgtg 
acggccggctcgacggcgctgacgtcggtgaccgggctggttcccgcgggggccgatgaggtctccgcccaagcg 
gcgacggcgttcacatcggagggcatccaattgctggcttccaatgcatcggcccaagaccagctccaccgtgcg 
ggcgaagcggtccaggacgtcgcccgcacctattcgcaaatcgacgacggcgccgccggcgtcttcgccgaa 



PPE coding sequence (SEQ ID No 3) 

atgctgtggcacgcaatgccaccggagctaaataccgcacggctgatggccggcgcgggtccggctccaatgctt 
gcggcggccgcgggatggcagacgctttcggcggctctggacgctcaggccgtcgagttgaccgcgcgcctgaac 
tctctgggagaagcctggactggaggtggcagcgacaaggcgcttgcggctgcaacgccgatggtggtctggcta 
caaaccgcgtcaacacaggccaagacccgtgcgatgcaggcgacggcgcaagccgcggcatacacccaggccatg 
gccacgacgccgtcgctgccggagatcgccgccaaccacatcacccaggccgtccttacggccaccaacttcttc 
ggtatcaacacgatcccgatcgcgttgaccgagatggattatttcatccgtatgtggaaccaggcagccctggca 
atggaggtctaccaggccgagaccgcggttaacacgcttttcgagaagctcgagccgatggcgtcgatccttgat 
cccggcgcgagccagagcacgacgaacccgatcttcggaatgccctcccctggcagctcaacaccggttggccag 
ttgccgccggcggctacccagaccctcggccaactgggtgagatgagcggcccgatgcagcagctgacccagccg 
ctgcagcaggtgacgtcgttgttcagccaggtgggcggcaccggcggcggcaacccagccgacgaggaagccgcg 
cagatgggcctgctcggcaccagtccgctgtcgaaccatccgctggctggtggatcaggccccagcgcgggcgcg 
ggcctgctgcgcgcggagtcgctacctggcgcaggtgggtcgttgacccgcacgccgctgatgtctcagctgatc 
gaaaagccggttgccccctcggtgatgccggcggctgctgccggatcgtcggcgacgggtggcgccgctccggtg 
ggtgcgggagcgatgggccagggtgcgcaatccggcggctccaccaggccgggtctggtcgcgccggcaccgctc 
gcgcaggagcgtgaagaagacgacgaggacgactgggacgaagaggacgactgg 



CFP-10 coding sequence (SEQ ID No 4) 

atggcagagatgaagaccgatgccgctaccctcgcgcaggaggcaggtaatttcgagcggatctccggcgacctg 
aaaacccagatcgaccaggtggagtcgacggcaggttcgttgcagggccagtggcgcggcgcggcggggacggcc 
gcccaggccgcggtggtgcgcttccaagaagcagccaataagcagaagcaggaactcgacgagatctcgacgaat 
attcgtcaggccggcgtccaatactcgagggccgacgaggagcagcagcaggcgctgtcctcgcaaatgggcttc 



ESAT-6 coding sequence (SEQ ID No 5) 

atgacagagcagcagtggaatttcgcgggtatcgaggccgcggcaagcgcaatccagggaaatgtcacgtccatt
cattccctccttgacgaggggaagcagtccctgaccaagctcgcagcggcctggggcggtagcggttcggaggcg 
taccagggtgtccagcaaaaatgggacgccacggctaccgagctgaacaacgcgctgcagaacctggcgcggacg 
atcagcgaagccggtcaggcaatggcttcgaccgaaggcaacgtcactgggatgttcgca 
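
As a quick consistency check on a coding sequence such as SEQ ID No 5, the reading frame can be translated in silico. The sketch below is illustrative only: it assumes the standard genetic code, the helper name translate is hypothetical, and the nucleotide string is a transcription of the ESAT-6 coding sequence listed above (285 nt, without a stop codon).

```python
# ESAT-6 coding sequence (SEQ ID No 5), transcribed from the listing above.
ESAT6_CDS = (
    "atgacagagcagcagtggaatttcgcgggtatcgaggccgcggcaagcgcaatccaggga"
    "aatgtcacgtccattcattccctccttgacgaggggaagcagtccctgaccaagctcgca"
    "gcggcctggggcggtagcggttcggaggcgtaccagggtgtccagcaaaaatgggacgcc"
    "acggctaccgagctgaacaacgcgctgcagaacctggcgcggacgatcagcgaagccggt"
    "caggcaatggcttcgaccgaaggcaacgtcactgggatgttcgca"
)

# Standard genetic code in the conventional t, c, a, g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (length must be a multiple of 3)."""
    cds = cds.lower()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


if __name__ == "__main__":
    protein = translate(ESAT6_CDS)
    print(len(ESAT6_CDS), "nt ->", len(protein), "aa")
    print(protein)
```

The same helper can be applied to the PE, PPE and CFP-10 coding sequences to confirm that each listed sequence is an uninterrupted open reading frame.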

CFP-10 + ESAT-6 (SEQ ID No 6) 

atggcagagatgaagaccgatgccgctaccctcgcgcaggaggcaggtaatttcgagcggat 
ctccggcgacctgaaaacccagatcgaccaggtggagtcgacggcaggttcgttgcagggcc 

agtggcgcggcgcggcggggacggccgcccaggccgcggtg 

gtgcgcttccaagaagcagccaataagcagaagcaggaactcgacgagatctcgacgaatat 
tcgtcaggccggcgtccaatactcgagggccgacgaggagcagcagcaggcgctgtcctcgc 
aaatgggcttctgacccgctaatacgaaaagaaacggagcaaaaacatgacagagcagcagt 
ggaatttcgcgggtatcgaggccgcggcaagcgcaatccagggaaatgtcacgtccattcat 
tccctccttgacgaggggaagcagtccctgaccaagctcgcagcggcctggggcggtagcgg
ttcggaggcgtaccagggtgtccagcaaaaatgggacgccacggctaccgagctgaacaacg
cgctgcagaacctggcgcggacgatcagcgaagccggtcaggcaatggcttcgaccgaaggc 
aacgtcactgggatgttcgca 

Primer SP6-BAC1
AGTTAGCTCACTCATTAGGCA (SEQ ID No 7)

Primer T7-BAC1
GGATGTGCTGCAAGGCGATTA (SEQ ID No 8)

primer esat-6F
GTCACGTCCATTCATTCCCT (SEQ ID No 9)

primer esat-6R
ATCCCAGTGACGTTGCCTT (SEQ ID No 10)

primer RD1mic flanking region F
GCAGTGCAAAGGTGCAGATA (SEQ ID No 11)

primer RD1mic flanking region R
GATTGAGACACTTGCCACGA (SEQ ID No 12)

primer plcA.int.F
CAAGTTGGGTCTGGTCGAAT (SEQ ID No 13)

primer plcA.int.R
GCTACCCAAGGTCTCCTGGT (SEQ ID No 14)

Sequence at the junction RD1mic

CAAC^CGkGGTTGTAAAACCTCGAC^ 

CGAG 7CCCATTTTGTCGCTGATTTGTTTGAACAGCGACGAACCGGTGTTGAAAATGTCGCCTGGGTCGGGGATTC 
CCT (SEQ ID No 15)

primer RD5mic flanking region F
GAATGCCGACGTCATATCG (SEQ ID No 16)

primer RD5mic flanking region R
CGGCCACTGAGTTCGATTAT (SEQ ID No 17)

Sequence at the junction RD5mic

CCTCC^VTGAACCACCTC^CATC^CCC

CCCCAGGTGTCCTGGGTCGTTCCGTTGACCGTCGAGTCCGAACATCCGTCATTCCCGGTGGCAGTCGGTGCGGTG 

AC (SEQ ID No 18)

primer MiD1 flanking region F
CAGCCAACACCAAGTAGACG (SEQ ID No 19)

primer MiD1 flanking region R
TCTACCTGCAGTCGCTTGTG (SEQ ID No 20)

Sequence at the junction MiD1

CACCT(^CAT<^CCCC^TCCTTTC^ 

CATCTTCTGGCAGATCAGCAGATCGCTTGTTCTCAGTGCAGGTGAGTC (SEQ ID No 21) 

primer MiD2 flanking region R
GTCCATCGAGGATGTCGAGT (SEQ ID No 22)

primer MiD2 flanking region L
CTAGGCCATTCCGTTGTCTG (SEQ ID No 23)

Sequence at the junction MiD2

GC!K«TMTXC^TCAACrc^ 

GC^GGTTCXTAAAGGCTTCGAGACCGGACGGGCTGTAGGTTCCTCAACTGTGTGGCGGATGGTCTGAGCACTTAA 
C (SEQ ID No 24) 

primer MiD3 flanking region R
GGCGACGCCATTTCC (SEQ ID No 25)

primer MiD3 flanking region L
AACTGTCGGGCTTGCTCTT (SEQ ID No 26)

Sequence at the junction MiD3
TG^GCCC*^CCTCCGTTGCCACC^^ 

GGGCGGGGTTCGCCCTCAGCCGCTAAACACGCCGACCAAGATCAACGAGCTACCTGCCCGGTCAAGGTTGAAGAG 
CCCCCATATCAGCAAGGGCCCGGTGTCGGCG (SEQ ID No 27) 
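
The primers listed above can be checked against a template in silico before use. The following sketch is an illustration under stated assumptions: it uses only exact string matching, the function names are hypothetical, and the template is the ESAT-6 coding sequence (SEQ ID No 5) transcribed from this listing, searched with primers esat-6F (SEQ ID No 9) and esat-6R (SEQ ID No 10).

```python
# ESAT-6 coding sequence (SEQ ID No 5), transcribed from the listing above.
ESAT6_CDS = (
    "atgacagagcagcagtggaatttcgcgggtatcgaggccgcggcaagcgcaatccaggga"
    "aatgtcacgtccattcattccctccttgacgaggggaagcagtccctgaccaagctcgca"
    "gcggcctggggcggtagcggttcggaggcgtaccagggtgtccagcaaaaatgggacgcc"
    "acggctaccgagctgaacaacgcgctgcagaacctggcgcggacgatcagcgaagccggt"
    "caggcaatggcttcgaccgaaggcaacgtcactgggatgttcgca"
)
ESAT6_F = "GTCACGTCCATTCATTCCCT"   # primer esat-6F (SEQ ID No 9)
ESAT6_R = "ATCCCAGTGACGTTGCCTT"    # primer esat-6R (SEQ ID No 10)

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the predicted PCR product for exact primer matches, or None.

    The forward primer is searched on the given strand; the reverse primer
    is searched as its reverse complement downstream of the forward site.
    """
    template = template.lower()
    start = template.find(forward.lower())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse).lower()
    end = template.find(reverse_site, start + 1)
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    product = predicted_amplicon(ESAT6_CDS, ESAT6_F, ESAT6_R)
    print("product length:", None if product is None else len(product))
```

Because matching is exact, a return value of None only means that no perfect primer site was found in the supplied template; it does not model mismatched annealing.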






SEQUENCE LISTING 

<110> INSTITUT PASTEUR 

<120> Identification of virulence associated regions RD1 and
RD5 leading to improve vaccine of M. bovis BCG and M.
microti

<130> D20217 

<140> EP 02/290854 
<141> 2002-04-05 

<160> 27 

<170> PatentIn Ver. 2.1



<210> 1
<211> 3909
<212> DNA
<213> Mycobacterium tuberculosis

<220>
<223> RD1-AP34 (a 3909 bp fragment of the M.
tuberculosis H37Rv genome)

<400> 1
gaattcccat ccagtgagtt caaggtcaag cggcgccccc ctggccaggc atttctcgtc 60
tcgccagacg gcaaagaggt catccaggcc ccctacatcg agcctccaga agaagtgttc 120
gcagcacccc caagcgccgg ttaagattat ttcattgccg gtgtagcagg acccgagctc 180
agcccggtaa tcgagttcgg gcaatgctga ccatcgggtt tgtttccggc tataaccgaa 240
cggtttgtgt acgggataca aatacaggga gggaagaagt aggcaaatgg aaaaaatgtc 300
acatgatccg atcgctgccg acattggcac gcaagtgagc gacaacgctc tgcacggcgt 360
gacggccggc tcgacggcgc tgacgtcggt gaccgggctg gttcccgcgg gggccgatga 420
ggtctccgcc caagcggcga cggcgttcac atcggagggc atccaattgc tggcttccaa 480
tgcatcggcc caagaccagc tccaccgtgc gggcgaagcg gtccaggacg tcgcccgcac 540
ctattcgcaa atcgacgacg gcgccgccgg cgtcttcgcc gaataggccc ccaacacatc 600
ggagggagtg atcaccatgc tgtggcacgc aatgccaccg gagctaaata ccgcacggct 660
gatggccggc gcgggtccgg ctccaatgct tgcggcggcc gcgggatggc agacgctttc 720
ggcggctctg gacgctcagg ccgtcgagtt gaccgcgcgc ctgaactctc tgggagaagc 780
ctggactgga ggtggcagcg acaaggcgct tgcggctgca acgccgatgg tggtctggct 840
acaaaccgcg tcaacacagg ccaagacccg tgcgatgcag gcgacggcgc aagccgcggc 900
atacacccag gccatggcca cgacgccgtc gctgccggag atcgccgcca accacatcac 960
ccaggccgtc cttacggcca ccaacttctt cggtatcaac acgatcccga tcgcgttgac 1020
cgagatggat tatttcatcc gtatgtggaa ccaggcagcc ctggcaatgg aggtctacca 1080
ggccgagacc gcggttaaca cgcttttcga gaagctcgag ccgatggcgt cgatccttga 1140
tcccggcgcg agccagagca cgacgaaccc gatcttcgga acgccctccc ctggcagctc 1200
aacaccggtt ggccagttgc cgccggcggc tacccagacc ctcggccaac tgggtgagat 1260
gagcggcccg atgcagcagc tgacccagcc gctgcagcag gtgacgtcgt tgttcagcca 1320
ggtgggcggc accggcggcg gcaacccagc cgacgaggaa gccgcgcaga tgggcctgct 1380
cggcaccagt ccgctgtcga accatccgct ggctggtgga tcaggcccca gcgcgggcgc 1440
gggcctgctg cgcgcggagt cgctacctgg cgcaggtggg tcgttgaccc gcacgccgct 1500
gatgtctcag ctgatcgaaa agccggttgc cccctcggtg atgccggcgg ctgctgccgg 1560
atcgtcggcg acgggtggcg ccgctccggt gggtgcggga gcgatgggcc agggtgcgca 1620
atccggcggc tccaccaggc cgggtctggt cgcgccggca ccgctcgcgc aggagcgtga 1680
agaagacgac gaggacgact gggacgaaga ggacgactgg tgagctcccg taatgacaac 1740
agacttcccg gccacccggg ccggaagact tgccaacatt ttggcgagga aggtaaagag 1800
agaaagtagt ccagcatggc agagatgaag accgatgccg ctaccctcgc gcaggaggca 1860
ggtaatttcg agcggatctc cggcgacctg aaaacccaga tcgaccaggt ggagtcgacg 1920
gcaggttcgt tgcagggcca gtggcgcggc gcggcgggga cggccgccca ggccgcggtg 1980
gtgcgcttcc aagaagcagc caataagcag aagcaggaac tcgacgagat ctcgacgaat 2040
attcgtcagg ccggcgtcca atactcgagg gccgacgagg agcagcagca ggcgctgtcc 2100
tcgcaaatgg gcttctgacc cgctaatacg aaaagaaacg gagcaaaaac atgacagagc 2160
agcagtggaa tttcgcgggt atcgaggccg cggcaagcgc aatccaggga aatgtcacgt 2220
ccattcattc cctccttgac gaggggaagc agtccctgac caagctcgca gcggcctggg 2280
gcggtagcgg ttcggaggcg taccagggtg tccagcaaaa atgggacgcc acggctaccg 2340
agctgaacaa cgcgctgcag aacctggcgc ggacgatcag cgaagccggt caggcaatgg 2400
cttcgaccga aggcaacgtc actgggatgt tcgcataggg caacgccgag ttcgcgtaga 2460
atagcgaaac acgggatcgg gcgagttcga ccttccgtcg gtcccgccct tcctcgtgtt 2520
tatacgtttg agcgcactct gagaggttgt catggcggcc gactacgaca agctcttccg 2580
gccgcacgaa ggtatggaag ctccggacga tatggcagcg cagccgttct tcgaccccag 2640
tgcttcgttt ccgccggcgc ccgcatcggc aaacctaccg aagcccaacg gccagactcc 2700
gcccccgacg tccgacgacc tgtcggagcg gttcgtgtcg gccccgccgc cgccaccccc 2760
acccccacct ccgcctccgc caactccgat gccgatcgcc gcaggagagc cgccctcgcc 2820
ggaaccggcc gcatctaaac cacccacacc ccccatgccc atcgccggac ccgaaccggc 2880
cccacccaaa ccacccacac cccccatgcc catcgccgga cccgaaccgg ccccacccaa 2940
accacccaca cctccgatgc ccatcgccgg acctgcaccc accccaaccg aatcccagtt 3000
agcgcccccc agaccaccga caccacaaac gccaaccgga gcgccgcagc aaccggaatc 3060
accggcgccc cacgtaccct cgcacgggcc acatcaaccc cggcgcaccg caccagcacc 3120
gccctgggca aagatgccaa tcggcgaacc cccgcccgct ccgtccagac cgtctgcgtc 3180
cccggccgaa ccaccgaccc ggcctgcccc ccaacactcc cgacgtgcgc gccggggtca 3240
ccgctatcgc acagacaccg aacgaaacgt cgggaaggta gcaactggtc catccatcca 3300
ggcgcggctg cgggcagagg aagcatccgg cgcgcagctc gcccccggaa cggagccctc 3360
gccagcgccg ttgggccaac cgagatcgta tctggctccg cccacccgcc ccgcgccgac 3420
agaacctccc cccagcccct cgccgcagcg caactccggt cggcgtgccg agcgacgcgt 3480
ccaccccgat ttagccgccc aacatgccgc ggcgcaacct gattcaatta cggccgcaac 3540
cactggcggt cgtcgccgca agcgtgcagc gccggatctc gacgcgacac agaaatcctt 3600
aaggccggcg gccaaggggc cgaaggtgaa gaaggtgaag ccccagaaac cgaaggccac 3660
gaagccgccc aaagtggtgt cgcagcgcgg ctggcgacat tgggtgcatg cgttgacgcg 3720
aatcaacctg ggcctgtcac ccgacgagaa gtacgagctg gacctgcacg ctcgagtccg 3780
ccgcaatccc cgcgggtcgt atcagatcgc cgtcgtcggt ctcaaaggtg gggctggcaa 3840
aaccacgctg acagcagcgt tggggccgac gttggctcag gtgcgggccg accggatcct 3900
ggctctaga 3909

<210> 2
<211> 297
<212> DNA
<213> Mycobacterium tuberculosis

<220>
<223> PE coding sequence

<400> 2
atggaaaaaa tgtcacatga tccgatcgct gccgacattg gcacgcaagt gagcgacaac 60
gctctgcacg gcgtgacggc cggctcgacg gcgctgacgt cggtgaccgg gctggttccc 120
gcgggggccg atgaggtctc cgcccaagcg gcgacggcgt tcacatcgga gggcatccaa 180
ttgctggctt ccaatgcatc ggcccaagac cagctccacc gtgcgggcga agcggtccag 240
gacgtcgccc gcacctattc gcaaatcgac gacggcgccg ccggcgtctt cgccgaa 297

<210> 3
<211> 1104
<212> DNA
<213> Mycobacterium tuberculosis

<220>
<223> PPE coding sequence

<400> 3
atgctgtggc acgcaatgcc accggagcta aataccgcac ggctgatggc cggcgcgggt 60
ccggctccaa tgcttgcggc ggccgcggga tggcagacgc tttcggcggc tctggacgct 120
caggccgtcg agttgaccgc gcgcctgaac tctctgggag aagcctggac tggaggtggc 180
agcgacaagg cgcttgcggc tgcaacgccg atggtggtct ggctacaaac cgcgtcaaca 240
caggccaaga cccgtgcgat gcaggcgacg gcgcaagccg cggcatacac ccaggccatg 300
gccacgacgc cgtcgctgcc ggagatcgcc gccaaccaca tcacccaggc cgtccttacg 360
gccaccaact tcttcggtat caacacgatc ccgatcgcgt tgaccgagat ggattatttc 420
atccgtatgt ggaaccaggc agccctggca atggaggtct accaggccga gaccgcggtt 480
aacacgcttt tcgagaagct cgagccgatg gcgtcgatcc ttgatcccgg cgcgagccag 540
agcacgacga acccgatctt cggaatgccc tcccctggca gctcaacacc ggttggccag 600
ttgccgccgg cggctaccca gaccctcggc caactgggtg agatgagcgg cccgatgcag 660
cagctgaccc agccgctgca gcaggtgacg tcgttgttca gccaggtggg cggcaccggc 720
ggcggcaacc cagccgacga ggaagccgcg cagatgggcc tgctcggcac cagtccgctg 780
tcgaaccatc cgctggctgg tggaccaggc cccagcgcgg gcgcgggcct gctgcgcgcg 840
gagtcgctac ctggcgcagg tgggtcgttg acccgcacgc cgctgatgtc tcagctgatc 900
gaaaagccgg ttgccccctc ggtgatgccg gcggctgctg ccggatcgtc ggcgacgggt 960
ggcgccgctc cggtgggtgc gggagcgatg ggccagggtg cgcaatccgg cggctccacc 1020
aggccgggtc tggtcgcgcc ggcaccgctc gcgcaggagc gtgaagaaga cgacgaggac 1080
gactgggacg aagaggacga ctgg 1104



<210> 4
<211> 300
<212> DNA
<213> Mycobacterium tuberculosis

<220>
<223> CFP-10 coding sequence

<400> 4
atggcagaga tgaagaccga tgccgctacc ctcgcgcagg aggcaggtaa tttcgagcgg 60
atctccggcg acctgaaaac ccagatcgac caggtggagt cgacggcagg ttcgttgcag 120
ggccagtggc gcggcgcggc ggggacggcc gcccaggccg cggtggtgcg cttccaagaa 180
gcagccaata agcagaagca ggaactcgac gagatctcga cgaatattcg tcaggccggc 240
gtccaatact cgagggccga cgaggagcag cagcaggcgc tgtcctcgca aatgggcttc 300



<210> 5
<211> 285
<212> DNA
<213> Mycobacterium tuberculosis
<220>
<223> ESAT-6 coding sequence
<400> 5
atgacagagc agcagtggaa tttcgcgggt atcgaggccg cggcaagcgc aatccaggga 60
aatgtcacgt ccattcattc cctccttgac gaggggaagc agtccctgac caagctcgca 120
gcggcctggg gcggtagcgg ttcggaggcg taccagggtg tccagcaaaa atgggacgcc 180
acggctaccg agctgaacaa cgcgctgcag aacctggcgc ggacgatcag cgaagccggt 240
caggcaatgg cttcgaccga aggcaacgtc actgggatgt tcgca 285

<210> 6
<211> 620
<212> DNA
<213> Mycobacterium tuberculosis
<220>
<223> CFP-10 + ESAT-6
<400> 6
atggcagaga tgaagaccga tgccgctacc ctcgcgcagg aggcaggtaa tttcgagcgg 60
atctccggcg acctgaaaac ccagatcgac caggtggagt cgacggcagg ttcgttgcag 120
ggccagtggc gcggcgcggc ggggacggcc gcccaggccg cggtggtgcg cttccaagaa 180
gcagccaata agcagaagca ggaactcgac gagatctcga cgaatattcg tcaggccggc 240
gtccaatact cgagggccga cgaggagcag cagcaggcgc tgtcctcgca aatgggcttc 300
tgacccgcta atacgaaaag aaacggagca aaaacatgac agagcagcag tggaatttcg 360
cgggtatcga ggccgcggca agcgcaatcc agggaaatgt cacgtccatt cattccctcc 420
ttgacgaggg gaagcagtcc ctgaccaagc tcgcagcggc ctggggcggt agcggttcgg 480
aggcgtacca gggtgtccag caaaaatggg acgccacggc taccgagctg aacaacgcgc 540
tgcagaacct ggcgcggacg atcagcgaag ccggtcaggc aatggcttcg accgaaggca 600
acgtcactgg gatgttcgca 620



<210> 7
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer SP6-BAC1
<400> 7
agttagctca ctcattaggc a



<210> 8 

<211> 21 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Primer T7-BAC1 

<400> 8 

ggatgtgctg caaggcgatt a 



<210> 9
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer esat-6F
<400> 9
gtcacgtcca ttcattccct



<210> 10 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Primer esat-6R 
<400> 10 

atcccagtga cgttgcctt 



<210> 11
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer RD1mic flanking region F
<400> 11
gcagtgcaaa ggtgcagata 20



<210> 12
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer RD1mic flanking region R
<400> 12
gattgagaca cttgccacga 20

<210> 13
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer plcA.int.F
<400> 13
caagttgggt ctggtcgaat 20

<210> 14
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer plcA.int.R
<400> 14
gctacccaag gtctcctggt 20

<210> 15
<211> 153
<212> DNA
<213> Mycobacterium tuberculosis
<220>
<223> Sequences at the junction RD1mic
<400> 15
caagacgagg ttgtaaaacc tcgacgcagg atcggcgatg aaatgccagt cggcgtcgct 60
gagcgcgcgc tgcgccgagt cccattttgt cgctgatttg tttgaacagc gacgaaccgg 120
tgttgaaaat gtcgcctggg tcggggattc cct 153



<210> 16 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer RD5 flanking region F

<400> 16 

gaatgccgac gtcatatcg 



<210> 17
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer RD5 flanking region R
<400> 17
cggccactga gttcgattat 20



<210> 18
<211> 152
<212> DNA
<213> Mycobacterium tuberculosis
<220>
<223> Sequence at the junction RD5mic
<400> 18
cctcgatgaa ccacctgaca tgaccccatc ctttccaaga actggagtct ccggacatgc 60
cggggcggtt cactgcccca ggtgtcctgg gtcgttccgt tgaccgtcga gtccgaacat 120
ccgtcattcc cggtggcagt cggtgcggtg ac 152



<210> 19
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer MiD1 flanking region F
<400> 19
cagccaacac caagtagacg



<210> 20
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer MiD1 flanking region R
<400> 20
tctacctgca gtcgcttgtg 20



<210> 21
<211> 123
<212> DNA
<213> Mycobacterium tuberculosis
<220>
<223> Sequence at the junction MiD1
<400> 21
cacctgacat gaccccatcc tttccaagaa ctggagtctc cggacatgcc ggggcggttc 60
agggacattc atgtccatct tctggcagat cagcagatcg cttgttctca gtgcaggtga 120
gtc 123



<210> 22
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer MiD2 flanking region R
<400> 22
gtccatcgag gatgtcgagt 20

<210> 23
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer MiD2 flanking region L
<400> 23
ctaggccatt ccgttgtctg 20

<210> 24
<211> 151
<212> DNA
<213> Mycobacterium tuberculosis
<220>
<223> Sequence at the junction MiD2
<400> 24
gctgcctact acgctcaacg ccagagacca gccgccggct gaggtctcag atcagagagt 60
ccccggactc accggggcgg ttcataaagg cttcgagacc ggacgggctg taggttcctc 120
aactgtgtgg cggatggtct gagcacttaa c 151



<210> 25
<211> 15
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer MiD3 flanking region R
<400> 25
ggcgacgcca tttcc 15



<210> 26
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Primer MiD3 flanking region L
<400> 26
aactgtcggg cttgctctt 19



<210> 27 

<211> 181

<212> DNA 

<213> Mycobacterium tuberculosis 
<220> 

<223> Sequence at the junction MiD3 
<400> 27 

tggcgccggc acctccgttg ccaccgttgc cgccgctggt gggcgcggtg ccgttcgccc 60 
cggccgaacc gttcagggcc gggttcgccc tcagccgcta aacacgccga ccaagatcaa 120 
cgagctacct gcccggtcaa ggttgaagag cccccatatc agcaagggcc cggtgtcggc 180 

g 181 
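
Each entry of the sequence listing declares its length under <211> and prints the bases in blocks of ten with a running position counter at the end of each row. The following Python sketch is illustrative only and forms no part of the application as filed; it shows one way such an entry could be checked, using the ESAT-6 coding sequence (SEQ ID No 5). The names `parse_listing` and `check_length` are hypothetical helpers introduced here.

```python
import re

# SEQ ID No 5 (ESAT-6 coding sequence) as printed in the listing,
# including the running position counter at the end of each row.
ESAT6_LISTING = """
atgacagagc agcagtggaa tttcgcgggt atcgaggccg cggcaagcgc aatccaggga 60
aatgtcacgt ccattcattc cctccttgac gaggggaagc agtccctgac caagctcgca 120
gcggcctggg gcggtagcgg ttcggaggcg taccagggtg tccagcaaaa atgggacgcc 180
acggctaccg agctgaacaa cgcgctgcag aacctggcgc ggacgatcag cgaagccggt 240
caggcaatgg cttcgaccga aggcaacgtc actgggatgt tcgca 285
"""


def parse_listing(text: str) -> str:
    """Join the base blocks of a listing entry, dropping counters and whitespace."""
    return "".join(re.findall(r"[acgt]+", text.lower()))


def check_length(text: str, declared: int) -> bool:
    """Compare the parsed sequence length with the value declared under <211>."""
    return len(parse_listing(text)) == declared


esat6 = parse_listing(ESAT6_LISTING)
print(len(esat6), check_length(ESAT6_LISTING, 285))
```

A mismatch between the parsed length and the <211> value would point to a transcription error in the entry, such as a dropped block or a damaged counter.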
Claims

1. A strain of M. bovis BCG or M. microti, wherein said strain has integrated all or part of the RD1 region responsible
for enhanced immunogenicity and increased persistence of BCG to the tubercle bacilli.

2. A strain according to claim 1 which has integrated a portion of DNA originating from Mycobacterium tuberculosis,
which comprises at least one gene selected from Rv3872 (SEQ ID No 2, mycobacterial PE), Rv3873 (SEQ ID No
3, PPE), Rv3874 (SEQ ID No 4, CFP-10), and Rv3875 (SEQ ID No 5, ESAT-6).

3. A strain according to claim 1 which has integrated a portion of DNA originating from Mycobacterium tuberculosis,
which comprises Rv3875 (SEQ ID No 5, ESAT-6). 

4. A strain according to claim 1 which has integrated a portion of DNA originating from Mycobacterium tuberculosis, 
which comprises Rv3874 (SEQ ID No 4, CFP-10). 

5. A strain according to claim 1 which has integrated a portion of DNA originating from Mycobacterium tuberculosis, 
which comprises both Rv3875 (SEQ ID No 5, ESAT-6) and Rv3874 (SEQ ID No 4, CFP-10).

6. A strain according to one of claims 2 to 5, wherein the coding sequence of the integrated gene is in frame with its
natural promoter or with an exogenous promoter, such as a promoter capable of directing high level of expression
of said coding sequence.

7. A strain according to one of claims 1 to 5, wherein the integrated gene is mutated so as to maintain the improved
immunogenicity while decreasing the virulence of the strain.

8. A strain according to claim 7, wherein said strain only carries parts of the genes coding for ESAT-6 or CFP-10 in
a mycobacterial expression vector under the control of a promoter, more particularly an hsp60 promoter. 

9. A strain according to claim 8, wherein said strain carries at least one portion of the esat-6 gene that codes for
immunogenic 20-mer peptides of ESAT-6 active as T-cell epitopes.

10. A strain according to claim 7, wherein the esat-6 and CFP-10 encoding genes are altered by directed mutagenesis 
in a way that most of the immunogenic peptides of ESAT-6 remain intact, but the biological functionality of ESAT-6
is lost.

11. M. bovis BCG::RD1 strains which have integrated a cosmid herein referred to as RD1-2F9 and RD1-AP34 contained
in the E. coli strains deposited at the CNCM under the accession numbers I-2831 and I-2832, respectively.

12. M. bovis BCG::RD1 strain which has integrated the construct RD1-AP34 which contains a 3909 bp fragment of 
the M. tuberculosis H37Rv genome from region 4350459 bp to 4354367 bp cloned (SEQ ID No 1). 

13. M. bovis BCG::RD1 strain which has integrated the fragment RD1-2F9 (~ 32 kb) that covers the region of the M.
tuberculosis genome AL123456 from ca. 4337 kb to ca. 4369 kb.

14. M. microti::RD1 strain which has integrated the construct RD1-AP34 which contains a 3909 bp fragment of the M.
tuberculosis H37Rv genome from region 4350459 bp to 4354367 bp cloned (SEQ ID No 1). 

15. M. microti::RD1 strain which has integrated the fragment RD1-2F9 (~ 32 kb) that covers the region of the M.
tuberculosis genome AL123456 from ca. 4337 kb to ca. 4369 kb.

16. A method for preparing and selecting improved M. bovis BCG or M. microti strains defined in one of claims 1 to
15 comprising a step consisting of modifying said strains by insertion, deletion or mutation in the integrated RD1
region, more particularly in the esat-6 or CFP-10 gene, said method leading to strains that are less virulent for
immuno-depressed individuals.

17. A cosmid or a plasmid comprising all or part of the RD1 region originating from Mycobacterium tuberculosis, said 
region comprising at least one gene selected from Rv3872 (mycobacterial PE), Rv3873 (PPE), Rv3874 (CFP-10), 
and Rv3875 (ESAT-6). 
18. A cosmid or a plasmid according to claim 17 comprising CFP-10, ESAT-6 or both or a part of them.

19. A cosmid or a plasmid according to claim 18 comprising a mutated gene selected from CFP-10, ESAT-6 or both, said
mutated gene being responsible for the improved immunogenicity and decreased virulence.

20. Use of a cosmid or a plasmid according to one of claims 17 to 19 for transforming M. bovis BCG or M. microti.

21. A pharmaceutical composition comprising a strain according to one of claims 1 to 15 and a pharmaceutically
acceptable carrier. 

22. A pharmaceutical composition according to claim 21 containing suitable pharmaceutically-acceptable carriers comprising
excipients and auxiliaries which facilitate processing of the living vaccine into preparations which can be
used pharmaceutically. 

23. A pharmaceutical composition according to claim 21 or 22 which is suitable for intravenous or subcutaneous administration.

24. A vaccine comprising a strain according to one of claims 1 to 15 and a suitable carrier. 

25. A product comprising a strain according to one of claims 1 to 15 and at least one protein selected from ESAT-6
and CFP-10 or an epitope derived therefrom for a separate, simultaneous or sequential use for treating tuberculosis.

26. The use of a strain according to one of claims 1 to 15 for preparing a medicament or a vaccine for preventing or
treating tuberculosis. 






27. The use of a strain according to one of claims 1 to 15 as an adjuvant/immunomodulator for preparing a medicament 
for the treatment of superficial bladder cancer. 

28. A method for the identification at the species level of members of the M. tuberculosis complex by means of markers
for RD1mic and RD5mic as a molecular diagnostic test.

29. A method according to claim 28 comprising the use of a primer selected from:

primer esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9),
primer esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10),
primer RD1mic flanking region F GCAGTGCAAAGGTGCAGATA (SEQ ID No 11),
primer RD1mic flanking region R GATTGAGACACTTGCCACGA (SEQ ID No 12),
primer RD5mic flanking region F GAATGCCGACGTCATATCG (SEQ ID No 16),
primer RD5mic flanking region R CGGCCACTGAGTTCGATTAT (SEQ ID No 17)

and the complementary sequences of said primers.

30. A diagnostic kit for the identification at the species level of members of the M. tuberculosis complex comprising DNA probes
and primers specifically hybridizing to a DNA portion of the RD1 or RD5 region of M. tuberculosis, more particularly
probes hybridizing under stringent conditions to a gene selected from Rv3871, Rv3872 (mycobacterial PE), Rv3873
(PPE), Rv3874 (CFP-10), Rv3875 (ESAT-6), and Rv3876, preferably CFP-10 and ESAT-6.

31. A diagnostic kit according to claim 30 comprising a probe or primer selected from:

esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9),
esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10),
RD1mic flanking region F GCAGTGCAAAGGTGCAGATA (SEQ ID No 11),
RD1mic flanking region R GATTGAGACACTTGCCACGA (SEQ ID No 12),
RD5mic flanking region F GAATGCCGACGTCATATCG (SEQ ID No 16),
RD5mic flanking region R CGGCCACTGAGTTCGATTAT (SEQ ID No 17).



32. A diagnostic kit for the identification at the species level of members of the M. tuberculosis complex comprising antibodies
directed to mycobacterial PE, PPE, CFP-10 and ESAT-6.

33. Virulence markers associated with RD1 and/or RD5 regions of the genome of M. tuberculosis or a part of these 
regions. 
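
Claim 29 extends to the complementary sequences of the listed primers. The following Python sketch is illustrative only and forms no part of the claims; it derives the reverse complement of each primer of claim 29, written 5' to 3'. The helper `reverse_complement` and the dictionary `PRIMERS` are hypothetical names introduced for this example, and the primer sequences are those of SEQ ID Nos 9 to 12, 16 and 17.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring whitespace."""
    seq = "".join(seq.split())
    return seq.translate(COMPLEMENT)[::-1]


# Primers of claim 29, keyed by name.
PRIMERS = {
    "esat-6F": "GTCACGTCCATTCATTCCCT",
    "esat-6R": "ATCCCAGTGACGTTGCCTT",
    "RD1mic flanking region F": "GCAGTGCAAAGGTGCAGATA",
    "RD1mic flanking region R": "GATTGAGACACTTGCCACGA",
    "RD5mic flanking region F": "GAATGCCGACGTCATATCG",
    "RD5mic flanking region R": "CGGCCACTGAGTTCGATTAT",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {reverse_complement(seq)}")
```

For example, the reverse complement of esat-6R is AAGGCAACGTCACTGGGAT, which occurs near the 3' end of the ESAT-6 coding sequence (SEQ ID No 5), while esat-6F matches that coding strand directly.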




FIGURES 2A to 2D (images not reproduced). Recoverable labels: BCG::RD1-2F9, BCG::RD1-I106, BCG::pYUB412; lanes H37Rv, BCG, Cyt, Mem, CW.



FIGURES 3A to 3D (images not reproduced). Recoverable labels: panels Spleen and Lung; x-axis "Days after infection"; strains BCG::RD1-2F9, BCG::pYUB412, BCG::RD5-I301.



FIGURE 4A 
□ Only part of the claims have been paid within the prescribed time limit. The present European search
report has been drawn up for the first ten claims and for those claims for which claims fees have
been paid, namely claim(s):



□ No claims fees have been paid within the prescribed time limit. The present European search report has 
been drawn up for the first ten claims. 



LACK OF UNITY OF INVENTION 

The Search Division considers that the present European patent application does not comply with the 
requirements of unity of invention and relates to several inventions or groups of inventions, namely:



see sheet B 



□ All further search fees have been paid within the fixed time limit. The present European search report has
been drawn up for all claims.

□ As all searchable claims could be searched without effort justifying an additional fee, the Search Division
did not invite payment of any additional fee.

□ Only part of the further search fees have been paid within the fixed time limit. The present European 
search report has been drawn up for those parts of the European patent application which relate to the 
inventions in respect of which search fees have been paid, namely claims: 



None of the further search fees have been paid within the fixed time limit. The present European search
report has been drawn up for those parts of the European patent application which relate to the invention
first mentioned in the claims, namely claims:




1-6, 11, 17, 18, 20-27 (all partial), 13 (complete) 







European Patent 
Office 



LACK OF UNITY OF INVENTION 
SHEET B 



Application Number 

EP 02 29 0864



The Search Division considers that the present European patent application does not comply with the
requirements of unity of invention and relates to several inventions or groups of inventions, namely: 

1. Claims: 1-6, 11, 17, 18, 20-27 (all partial), 13 (complete) 



Recombinant M. bovis BCG strain transformed with a construct 
comprising the entire RD1 region (RD1-2F9), cosmids and 
plasmids comprising said region and their use for the 
transformation of M. bovis BCG, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for 
preparing a medicament for the treatment of superficial 
bladder cancer. 



2. Claims: 1, 2, 6, 11-13, 17 and 20-27 (all partial) 

Recombinant M. bovis BCG strain transformed with a construct 
comprising part of the RD1 region, i.e. a nucleic acid 
sequence according to Sequence Id No. 2, cosmids and plasmids 
comprising said nucleic acid sequence and their use for the 
transformation of M. bovis BCG, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for 
preparing a medicament for the treatment of superficial 
bladder cancer. 



3. Claims: 1, 2, 6, 11-13, 17 and 20-27 (all partial) 

Recombinant M. bovis BCG strain transformed with a construct 
comprising part of the RD1 region, i.e. a nucleic acid 
sequence according to Sequence Id No. 3, cosmids and plasmids
comprising said nucleic acid sequence and their use for the 
transformation of M. bovis BCG, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for
preparing a medicament for the treatment of superficial 
bladder cancer. 



4. Claims: 1, 2, 11-13, 17, 18 and 20-27 (all partial) and 4 (complete)

Recombinant M. bovis BCG strain transformed with a construct
comprising part of the RD1 region, i.e. a nucleic acid 
sequence according to Sequence Id No. 4, cosmids and plasmids 
comprising said nucleic acid sequence and their use for the 
transformation of M. bovis BCG, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for
preparing a medicament for the treatment of superficial 
bladder cancer. 



5. Claims: 1, 2, 11-13, 17, 18 and 20-27 (all partial) and 3 (complete)

Recombinant M. bovis BCG strain transformed with a construct 
comprising at least a nucleic acid sequence according to 
Sequence Id No. 5, cosmids and plasmids comprising said 
nucleic acid sequence and their use for the transformation 
of M. bovis BCG, pharmaceutical compositions and vaccines 
comprising said strain and use of said strain for preparing 
a medicament or vaccine for preventing or treating 
tuberculosis or as adjuvant/immunomodulator for preparing a
medicament for the treatment of superficial bladder cancer. 

6. Claims: 1, 6, 11-13, 17, 18 and 20-27 (all partial) and 5 (complete)

Recombinant M. bovis BCG strain transformed with a construct 
comprising part of the RD1 region, i.e. nucleic acid 
sequences according to Sequence Id No. 4 and 5, cosmids and 
plasmids comprising said nucleic acid sequence and their use 
for the transformation of M. microti, pharmaceutical 
compositions and vaccines comprising said strain and use of 
said strain for preparing a medicament or vaccine for 
preventing or treating tuberculosis or as 
adjuvant/immunomodulator for preparing a medicament for the
treatment of superficial bladder cancer. 



7. Claims: 1-6, 11, 17, 18 and 20-27 (all partial) and 12 (complete)

Recombinant M. bovis BCG strain transformed with a construct
comprising part of the RD1 region (RD1-AP34, Seq Id No. 1),
cosmids and plasmids comprising said region and their use
for the transformation of M. bovis BCG, pharmaceutical 
compositions and vaccines comprising said strain and use of 
said strain for preparing a medicament or vaccine for 
preventing or treating tuberculosis or as 
adjuvant/immunomodulator for preparing a medicament for the
treatment of superficial bladder cancer. 



8. Claims: 1-6, 17, 18 and 20-27 (all partial) and 15 (complete)



Recombinant M. microti strain transformed with a construct 
comprising the entire RD1 region (RD1-2F9), cosmids and 
plasmids comprising said nucleic acid sequence and their use 
for the transformation of M. microti, pharmaceutical 
compositions and vaccines comprising said strain and use of 
said strain for preparing a medicament or vaccine for 
preventing or treating tuberculosis or as 
adjuvant/immunomodulator for preparing a medicament for the
treatment of superficial bladder cancer.



9. Claims: 1, 2, 6, 14, 15, 17 and 20-27 (all partial)

Recombinant M. microti strain transformed with a construct 
comprising part of the RD1 region, i.e. a nucleic acid
sequence according to Sequence Id No. 2, cosmids and 
plasmids comprising said nucleic acid sequence and their use 
for the transformation of M. microti, pharmaceutical 
compositions and vaccines comprising said strain and use of 
said strain for preparing a medicament or vaccine for 
preventing or treating tuberculosis or as 
adjuvant/immunomodulator for preparing a medicament for the
treatment of superficial bladder cancer. 



10. Claims: 1, 2, 6, 14, 15, 17 and 20-27 (all partial)

Recombinant M. microti strain transformed with a construct 
comprising a part of RD1, i.e. a nucleic acid sequence 
according to Sequence Id No. 3, cosmids and plasmids 
comprising said nucleic acid sequence and their use for the 
transformation of M. microti, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for
preparing a medicament for the treatment of superficial 
bladder cancer. 
11. Claims: 1, 2, 6, 14, 15, 17, 18, 20-27 (all partial) and 4 (complete)

Recombinant M. microti strain transformed with a construct 
comprising a part of RD1, i.e. a nucleic acid sequence 
according to Sequence Id No. 4, cosmids and plasmids 
comprising said nucleic acid sequence and their use for the 
transformation of M. microti, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for
preparing a medicament for the treatment of superficial 
bladder cancer. 



12. Claims: 1, 2, 6, 14, 15, 17, 18, 20-27 (all partial) and 3 (complete)

Recombinant M. microti strain transformed with a construct 
comprising a part of RD1, i.e. a nucleic acid sequence 
according to Sequence Id No. 5, cosmids and plasmids 
comprising said nucleic acid sequence and their use for the
transformation of M. microti, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for
preparing a medicament for the treatment of superficial 
bladder cancer. 



13. Claims: 1, 6, 14, 15, 17, 18, 20-27 (all partial) and 5 (complete)



Recombinant M. microti strain transformed with a construct 
comprising a part of RD1, i.e. a nucleic acid sequence 
according to Sequence Id No. 4 and 5, cosmids and plasmids 
comprising said nucleic acid sequence and their use for the 
transformation of M. microti, pharmaceutical compositions 
and vaccines comprising said strain and use of said strain 
for preparing a medicament or vaccine for preventing or 
treating tuberculosis or as adjuvant/immunomodulator for
preparing a medicament for the treatment of superficial 
bladder cancer. 






14. Claims: 1-6, 17, 18, 20-27 (all partial) and 14 (complete) 

Recombinant M. microti strain transformed with a construct 
comprising part of the RD1 region (RD1-AP34, Seq Id No. 1),
cosmids and plasmids comprising said region and their use 
for the transformation of M. microti, pharmaceutical 
compositions and vaccines comprising said strain and use of 
said strain for preparing a medicament or vaccine for 
preventing or treating tuberculosis or as 
adjuvant/immunomodulator for preparing a medicament for the
treatment of superficial bladder cancer. 



15. Claims: 28-33 

Method for the identification at the species level of
members of the M. tuberculosis complex by means of markers
of the RD1 and RD5 region, resp., which are specific for M.
microti.